Error Correction

Finding Missing Blocks when the format has Inter Block/Record Gaps

--------------------------------------------------------------------------------------------------

So far, we have several methods to correct (or extract as much as possible from) bad blocks, blocks with any kind of error, or blocks with a missing data pulse gap.  We like to try them in this order:

Method 1: Block Splicing from Multiple Adjusted Track Reads

Our Universal QIC Reader hardware has a fine-tuning track alignment adjustment, as demonstrated in this video on our Hardware page.

Here are several (very long) videos that demonstrate the minutiae of this process:


If this method fails, we move on to Method 2.

Method 2: Bad Block Re-Assembly

In this way, we are able to restore as much data as possible, sacrificing very little.

The concept is to locate the point of the error in the block, and then reconstruct the bytes both before and after the error point.

Our programs already translate the bytes from the block preamble up to the error point.  Once the bytes after the error point are recovered as well, putting the two pieces together tells us, if the size of the data block is known, at least how many bytes are missing.
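A minimal Python sketch of this arithmetic (not the LotusScript programs described on this page).  The function name and the example byte counts are hypothetical, and it assumes both counts cover data bytes only, without the CRC:

```python
def missing_byte_count(block_size, bytes_before, bytes_after):
    """Return how many bytes were lost at the error point."""
    missing = block_size - (bytes_before + bytes_after)
    if missing < 0:
        # The two pieces overlap or the assumed block size is wrong
        raise ValueError("recovered pieces are longer than the block")
    return missing


# Hypothetical example: a 512-byte block with 300 bytes decoded before
# the error point and 200 bytes decoded after it
print(missing_byte_count(512, 300, 200))  # prints 12
```

A negative result is treated as an error, since it would mean the assumed block size is wrong or the two pieces overlap.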

Sometimes, based on the context of the bytes, we can make educated guesses as to what the missing bytes should be, and test each guess against the CRC, to see which one is correct.

But how do you determine the bytes after the error point if you don't know what data pulse to start counting at?  Well, this is where calculating all of the possibilities comes in, and then choosing the result with no errors, or the fewest errors.

So far, the data we have read is GCR4/5 formatted, which means that every nybble (half of a byte) is made up of 5 binary digits.  The start point after the "unreadable" section must therefore be in one of 5 locations.  If we get it wrong, we get garbage, or a "bitwise shift".  But one of them should be right.  So, our programs calculate all 5 options, and show the results.

Therefore, one can look at the 5 results, and usually, the correct one becomes obvious.
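The idea of trying all 5 start points can be sketched in a few lines of Python.  This is only an illustration, not the program used here: the code table is copied from the GCR45Lookup routine further down this page, while the function names and the test stream are made up, and start points are numbered 0 to 4 rather than 1 to 5.

```python
# GCR 4/5 code table, copied from the GCR45Lookup routine on this page
GCR45 = {
    "11001": "0", "11011": "1", "10010": "2", "10011": "3",
    "11101": "4", "10101": "5", "10110": "6", "10111": "7",
    "11010": "8", "01001": "9", "01010": "A", "01011": "B",
    "11110": "C", "01101": "D", "01110": "E", "01111": "F",
    "11111": "P",  # preamble / postamble
    "00101": "M",  # filemark
}
# Reverse table, used only to build the made-up test stream below
ENCODE = {hex_char: bits for bits, hex_char in GCR45.items()}


def decode_from(bits, offset):
    """Decode 5-digit groups starting at `offset`; unknown groups become 'X'."""
    groups = (bits[i:i + 5] for i in range(offset, len(bits) - 4, 5))
    return "".join(GCR45.get(group, "X") for group in groups)


def try_all_shifts(bits):
    """Return the five candidate decodes, one per possible start point."""
    return [decode_from(bits, offset) for offset in range(5)]


# Made-up test data: three stray digits followed by the encoding of "DEADBEEF"
stream = "101" + "".join(ENCODE[h] for h in "DEADBEEF")
for offset, text in enumerate(try_all_shifts(stream)):
    print(offset, text, "X count:", text.count("X"))
```

In this made-up stream, offset 3 recovers DEADBEEF.  A wrong offset can also happen to decode without any X, so the X count is only a hint, and the context of the bytes still has to be judged by eye.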

If there is more than one error point in the data block, then we run this process after the first error point, and see if we can make sense of the pieces.  The process becomes exponentially more difficult the more error points there are within a single data block.  So, we just cross our fingers and hope that there's only one.  Most of the time, this is the case.

It's a long and arduous process, but nobody else has ever read "unreadable" QIC tapes before, and documented it, so we are glad to do this.

So, looking a bit deeper, these are the steps we go through, with the help of our programs, to perform this.

1) Identify the error in the block, either using our programs, or visually in the GUI of the logic analyzer capture file.  

Our first program in the Bytes Translation Process, called "Decoding to Data Block Strings", creates a verbose log file, which identifies the time between every data pulse (or magnetic flux transition), and the string representation of the interpreted binary value of that time between pulses.  We call this the "Master Decode Log", and split it up every 20 identifiable data blocks.

The point is to identify a unique line in this Master Decode Log, right after the error point.

2) Run the decode process 5 times, starting at that point in time in the log, each time shifting the start point one pulse to the right.

3) Compare the results visually in an output log, and pick the one with no errors, or the fewest errors.

4) Copy and paste the first part of the block, before the error point, into a hex editor.  Whenever a data block start marker is identifiable, our Data Bytes Block Report always shows the bytes in the data block up to the point of the error.

5)  Put in some "padding" bytes after the first part of the block, to "mark" the point where the error is.  

6)  Copy the chosen block-section from the "shift decisions" log file, and paste it in the hex editor, after your error point "mark".

7) Adjust the "mark-padding" to the exact size needed to make the entire block the size that it is supposed to be, according to the data format.  That's always 512 bytes for QIC-11 & QIC-24.  Kennedy 64xx format seems to vary, but it is in logical lengths at least.  Logical to whom I'm not certain, but there are wide variations.  (Our case study video actually demonstrates where this flexible data block size can be a nasty unknown).

8) Start guessing what the missing bytes *might* be.  If you get lucky, there are recognizable patterns to try or complete, or a limited set of nybbles or bytes that would belong in that section of the data.

9) Test your various options against the CRC (this will end up being a whole different process, documented later and elsewhere).
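Steps 7 to 9 can be sketched in Python as a brute-force search.  This is a rough illustration under loud assumptions, not our actual CRC process: the CRC used here is Python's `binascii.crc_hqx` (CRC-CCITT, polynomial 0x1021) with an initial value of 0, chosen only as a stand-in, and the real polynomial and initial value for each tape format may differ.  The function names and the test block are made up.

```python
import binascii
from itertools import product


def crc16(data):
    # Stand-in CRC (an assumption): CRC-CCITT, polynomial 0x1021, initial value 0
    return binascii.crc_hqx(data, 0)


def find_fills(before, after, missing, expected_crc, candidates=range(256)):
    """Try every combination of candidate bytes in the gap and
    return the fills whose CRC matches the one read from the tape."""
    matches = []
    for guess in product(candidates, repeat=missing):
        block = before + bytes(guess) + after
        if crc16(block) == expected_crc:
            matches.append(bytes(guess))
    return matches


# Made-up 512-byte block with a 2-byte hole at offsets 100 and 101
original = bytes(range(256)) * 2
before, after = original[:100], original[102:]
fills = find_fills(before, after, 2, crc16(original))
print([fill.hex() for fill in fills])
```

The search grows by a factor of 256 for every missing byte, so passing a smaller `candidates` set (for example, only the byte values that make sense in that part of the data) is what keeps longer gaps practical.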

Here is a video of my use of this system to restore the majority of a bad data block:



--------------------------------------------------------------------------------------------------

This process runs two separate programs in succession.  Here is the bulk of the code from each.  Again, these are written in LotusScript:

Block Repair Step 1 v4

Sub Initialize

    ' Read the run settings from the parameter file ("Name = value" lines)
    TheSource = "C:\Temp\TheSource.txt"
    Open TheSource For Input As 90
    Do While Not Eof(90)
        Line Input #90, OneLine
        If Instr(OneLine, "ASourceFileName =") > 0 Then
            ASourceFileName = Mid(OneLine,19,999)
        End If
        If Instr(OneLine, "FilePath = ") > 0 Then
            FilePath =  Mid(OneLine,12,999)
        End If
        If Instr(OneLine, "BadBlockStartMarker = ") > 0 Then
            BadBlockStartMarker = Mid(OneLine,23,999)
        End If
        If Instr(OneLine, "BBSM = ") > 0 Then
            BBSM = Mid(OneLine,8,999)
        End If
        If Instr(OneLine, "MDLBlockEnd = ") > 0 Then
            MDLBlockEnd = Mid(OneLine,15,999)
        End If
    Loop
    Close 90

    ' Input is the Master Decode Log section; output is the Step 1 repair file
    ASourceFileName = ASourceFileName + " - Master Decode Log - " + MDLBlockEnd
    OutputFileName9 = FilePath + BBSM + " Repair S1.txt"
    SourceFileName = FilePath + ASourceFileName

    Dim dateTime As New NotesDateTime("12/01/1900")
    Dim session As New NotesSession
    Set Db = session.currentdatabase

    Open SourceFileName For Input As 1
    Open OutputFileName9 For Output As 9

    StartCapture = 0

    Do While Not Eof(1)
        LineCount = LineCount + 1
        Line Input #1, OneLine

        ' Stop capturing when the next preamble line is reached
        If Instr(OneLine,"Average Preable Pulse Time =") Then
            StartCapture = 0
        End If

        ' While capturing, append the text before the first comma of each usable line
        If StartCapture = 1 Then
            If Instr(OneLine,"Potential Data Block without Preamble") = 0 Then
                If Instr(OneLine,"Discarding pulse duration") = 0 Then
                    If Instr(OneLine,",") Then
                        BlockString = BlockString + Mid(OneLine,1,Instr(OneLine,",")-1)
                    End If
                End If
            End If
        End If

        ' Start capturing on the line after the chosen marker line
        If OneLine = BadBlockStartMarker Then
            StartCapture = 1
        End If

        PreviousLIne = OneLine
TheNextLine:
    Loop

    Print #9, BlockString

    Close

End Sub

Block Repair Step 2 Kennedy v2 

(We have a different version for each of the QIC-11 and QIC-24 formats, since the end-of-block anatomy of each format is different, but the base logic is the same.)

Sub Initialize
    NybblesAfterLastX = 0 ' in Nybbles

    ' Read the run settings from the parameter file ("Name = value" lines)
    TheSource = "C:\Temp\TheSource.txt"
    Open TheSource For Input As 90
    Do While Not Eof(90)
        Line Input #90, OneLine
        If Instr(OneLine, "ASourceFileName =") > 0 Then
            ASourceFileName = Mid(OneLine,19,999)
        End If
        If Instr(OneLine, "FilePath = ") > 0 Then
            FilePath =  Mid(OneLine,12,999)
        End If
        If Instr(OneLine, "BadBlockStartMarker = ") > 0 Then
            BadBlockStartMarker = Mid(OneLine,23,999)
        End If
        If Instr(OneLine, "BBSM = ") > 0 Then
            BBSM = Mid(OneLine,8,999)
        End If
        If Instr(OneLine, "MDLBlockEnd = ") > 0 Then
            MDLBlockEnd = Mid(OneLine,15,999)
        End If
        If Instr(OneLine, "GoThruXes = ") > 0 Then
            GoThruXes = Mid(OneLine,13,999)
        End If
    Loop
    Close 90

    Dim dateTime As New NotesDateTime("12/01/1900")
    Dim session As New NotesSession
    Set Db = session.currentdatabase

    Call ProcessShifts

    If GoThruXes = "Yes" Then 'Run it again without this
        GoThruXes = "No"
        Call ProcessShifts
    End If

End Sub

Sub GCR45Lookup
    ' Translates the 5-digit group in ThisNybble into one character in HexNybble.
    ' If none of the cases match, this sub-routine returns "X", indicating a problem.
    Select Case ThisNybble
    Case "11001"
        HexNybble = "0"
    Case "11011"
        HexNybble = "1"
    Case "10010"
        HexNybble = "2"
    Case "10011"
        HexNybble = "3"
    Case "11101"
        HexNybble = "4"
    Case "10101"
        HexNybble = "5"
    Case "10110"
        HexNybble = "6"
    Case "10111"
        HexNybble = "7"
    Case "11010"
        HexNybble = "8"
    Case "01001"
        HexNybble = "9"
    Case "01010"
        HexNybble = "A"
    Case "01011"
        HexNybble = "B"
    Case "11110"
        HexNybble = "C"
    Case "01101"
        HexNybble = "D"
    Case "01110"
        HexNybble = "E"
    Case "01111"
        HexNybble = "F"
    Case "11111" ' Preamble / Postamble
        HexNybble = "P"
    Case "00101" ' Filemark Byte
        HexNybble = "M"
    Case "^^^^^" ' My internal end of block marker
        HexNybble = "-"
    Case "~~~~~" ' My internal end of block marker
        HexNybble = "+"
    Case Else
        HexNybble = "X" ' Unidentified Character...problem!
    End Select
End Sub


Sub ProcessShifts
    SourceFileName = FilePath + BBSM + " Repair S1.txt"
    OutputFileName9 = FilePath + ASourceFileName + " - Final Remaining Hex Block.txt"
    Open OutputFileName9 For Output As 9
    ' Reset the suffix so a second pass with GoThruXes = "No" does not reuse it
    ' (needed if GoThruXesNamer is a module-level variable)
    GoThruXesNamer = ""
    If GoThruXes = "Yes" Then GoThruXesNamer = " Regardless of Xes"
    OutputFileName8 = SourceFileName + " - Shift Decision" + GoThruXesNamer + ".txt"
    Open SourceFileName For Input As 1
    Open OutputFileName8 For Output As 8
    Count = 0
    LineCount = 0
    Do While Not Eof(1)
        LineCount = LineCount + 1
        Line Input #1, OneLine

        ' The postamble (a run of 1s) marks the end of the block remainder
        PostambleAddress = Instr(OneLine,"1111111111111111")
        Print #8, "Block Remainder Full Length =" + Str(PostambleAddress)  + " (" + Str(PostambleAddress/5) + " Nybbles )"
        Print #8, ""

        ' Find the position of the last "X" in the line
        If Instr(OneLine,"X" ) Then
            For a = 1 To Len(OneLine) - 1
                XLocation = Instr(Mid(OneLine,a,99999),"X" )
                If XLocation = 0 Then
                    LastXLocation = a-1
                    Print #8, "Last X Location =" + Str(a-1)
                    Print #8, ""
                    Goto NoMoreXes
                End If
            Next a
NoMoreXes:
        End If
        Print #8, "Start Nybbles After Last X =" + Str(NybblesAfterLastX)
        Print #8, ""

        ' Work out how much of the line, counted back from the postamble, to decode
        SampleStartConsideration = LastXLocation + (NybblesAfterLastX + 5)
        If GoThruXes = "Yes" Then
            Print #8, "But GoThruXes = Yes, so we're doing it all"
            Print #8, ""
            SampleStartConsideration = 0
        End If
        SampleConsideration = PostambleAddress - SampleStartConsideration
        NybbleConsiderationCount = SampleConsideration / 5
        Print #8, "NybbleConsiderationCount =" + Str(NybbleConsiderationCount)
        Print #8, ""
        SampleSize = Int(NybbleConsiderationCount) - 1
        Print #8, "Sample Size = " + Str(SampleSize)
        Print #8, ""
        SampleLength = (SampleSize +1) * 5
        StartSample = PostambleAddress-(SampleSize * 5)
        ReverseStartSample = Mid(OneLine,StartSample,SampleLength)
        Print #8, ReverseStartSample
        Print #8, ""

        ' Decode the sample 5 times, once for each possible start point
        For s = 1 To 5
            HexBlock = ""
            StartPoint = s
            Print #8, "Start Point = " + Str(StartPoint)
            For i = 1 To SampleSize
                ThisNybble = Mid(ReverseStartSample,StartPoint,5)
                Call GCR45Lookup
                HexBlock = HexBlock + HexNybble
                StartPoint = StartPoint + 5
            Next i
            Print #8, HexBlock

            ''''start Kennedy-specific format conditions
            ' A valid option has a "P" end mark followed by the last 4 nybbles (the CRC)
            If Mid(HexBlock,Len(HexBlock)-4,1) = "P" Then
                DataSegment = Mid(HexBlock,1,Len(HexBlock)-5)
                If Instr(DataSegment, "X") Then
                    Print #8, "Option = NO:  X nybbles are found"
                Elseif Instr(DataSegment, "P") Then
                    Print #8, "Option = NO:  P nybbles are found where they shouldn't be"
                Elseif Instr(DataSegment, "S") Then
                    Print #8, "Option = NO:  S nybbles are found"
                Elseif Instr(DataSegment, "M") Then
                    Print #8, "Option = NO:  M nybbles are found"
                Else
                    Print #8, "Option = Possible!"
                End If
            Else
                Print #8, "Option = NO:  there is no 'P End Mark' before the CRC"
            End If
            Print #8, ""
        Next s
        Print #8, ""
        Print #8, "The last nybble of StartPoint 5 should always be an 'F'"
TheNextLine:
    Loop
    Close
End Sub
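
For readers who do not use LotusScript, the Kennedy-specific check at the end of ProcessShifts can be restated as a short Python sketch.  The function name and the sample strings are made up for illustration; the rules themselves are taken from the listing above (a "P" end mark five nybbles from the end, followed by four CRC nybbles, and no X, P, S or M nybbles in the data before it).

```python
def classify_option(hex_block):
    """Apply the Kennedy end-of-block checks to one decoded shift option."""
    # The fifth character from the end must be the 'P' end mark,
    # leaving four nybbles (the CRC) after it
    if len(hex_block) < 5 or hex_block[-5] != "P":
        return "NO: there is no 'P End Mark' before the CRC"
    data_segment = hex_block[:-5]
    # Any of these characters inside the data means the shift is wrong
    for bad in "XPSM":
        if bad in data_segment:
            return "NO: " + bad + " nybbles are found"
    return "Possible!"


# Made-up decoded options
print(classify_option("DEADBEEFP1234"))
print(classify_option("DEXDBEEFP1234"))
print(classify_option("DEADBEEF"))
```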

-------------------------------------------

Dwight Elvey shares some insights into his own experience with data correction using CRCs:

https://plus.google.com/109824230561004948952/posts/3FDebRVjSJ1
