Unable to mount previous written tape #446

Closed
heinowalther opened this issue Mar 26, 2024 · 14 comments
@heinowalther

heinowalther commented Mar 26, 2024

Unable to mount tapes previously written with an older version of LTFS (written about a year ago; I cannot remember the LTFS version).
Tried with different tapes, with the same result...
(The tapes are set to read-only; can that be the issue?)
Ubuntu 22.04LTS, blacklisted the "st" driver.
LTFS version 2.5.0.0 (Prelim).
LTFS Format Specification version 2.4.0.
IBM ULTRIUM-HH8

This is how I mount...

ltfs -o devname=/dev/sg29 /mnt/lto
31b9 LTFS14000I LTFS starting, LTFS version 2.5.0.0 (Prelim), log level 2.
31b9 LTFS14058I LTFS Format Specification version 2.4.0.
31b9 LTFS14104I Launched by "ltfs -o devname=/dev/sg29 /mnt/lto".
31b9 LTFS14105I This binary is built for Linux (x86_64).
31b9 LTFS14106I GCC version is 11.4.0.
31b9 LTFS17087I Kernel version: Linux version 5.15.0-101-generic (buildd@lcy02-amd64-032) (gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0, GNU ld (GNU Binutils for Ubuntu) 2.38) #111-Ubuntu SMP Tue Mar 5 20:16:58 UTC 2024 i386.
31b9 LTFS17089I Distribution: DISTRIB_ID=Ubuntu.
31b9 LTFS17089I Distribution: PRETTY_NAME="Ubuntu 22.04.4 LTS".
31b9 LTFS14063I Sync type is "time", Sync time is 300 sec.
31b9 LTFS17085I Plugin: Loading "sg" tape backend.
31b9 LTFS17085I Plugin: Loading "unified" iosched backend.
31b9 LTFS14095I Set the tape device write-anywhere mode to avoid cartridge ejection.
31b9 LTFS30209I Opening a device through sg-ibmtape driver (/dev/sg29).
31b9 LTFS30250I Opened the SCSI tape device 10.0.0.0 (/dev/sg29).
31b9 LTFS30207I Vendor ID is IBM     .
31b9 LTFS30208I Product ID is ULTRIUM-HH8     .
31b9 LTFS30214I Firmware revision is J4D1.
31b9 LTFS30215I Drive serial is 10WT021725.
31b9 LTFS30285I The reserved buffer size of /dev/sg29 is 1048576.
31b9 LTFS30294I Setting up timeout values from RSOC.
31b9 LTFS17160I Maximum device block size is 1048576.
31b9 LTFS11330I Loading cartridge.
31b9 LTFS30252I Logical block protection is disabled.
31b9 LTFS11332I Load successful.
31b9 LTFS17157I Changing the drive setting to write-anywhere mode.
31b9 LTFS11005I Mounting the volume from device.
31b9 LTFS30252I Logical block protection is disabled.
31b9 LTFS11022I Restoring volume consistency by writing an index to the index partition.
31b9 LTFS14013E Cannot mount the volume from device.
31b9 LTFS30252I Logical block protection is disabled.
@heinowalther

This comment was marked as abuse.

@heinowalther
Author

Just an update... mounting seems to work with tapes created with the LTFS version currently running...
Maybe there is some compatibility issue? Any suggestions?

@piste-jp-ibm
Member

It is not a compatibility issue at all.

It looks like LTFS found a logical inconsistency on the tape. A typical scenario is stopping the machine without unmounting LTFS.

@heinowalther
Author

OK, but is there a way to fix it? If I remove the write-protection on the tape, will that fix the issue while mounting, or do we need to run ltfsck?
Side note: I am pretty sure we unmounted and ejected the tape "correctly" back when we wrote to the tapes, so maybe there was a bug in the version we used? Seems odd that all 5 times we have tested so far show this issue...

@piste-jp-ibm
Member

> so maybe there was a bug in the version we used?

I don't think so. I have never seen this kind of problem.

> Seems odd that all 5 times we have tested so far show this issue...

I'm a little bit confused, because you stated that you cannot remember which version was used for writing.

Please send a pair of logs, write side and read side, if you can recreate the problem. I would like the logs that are written under /var/log, not the log printed to the screen.

@heinowalther
Author

I cannot find any LTFS-related log files in /var/log.
I will try to mount one of the tapes again and give it a go without the write-protection.
The reason we do not know which version we wrote the tapes with is that we have decommissioned the old server. But we have restored files from the tapes back on the original server.

@heinowalther
Author

heinowalther commented Mar 27, 2024

Well we managed to mount a few tapes by just removing the write-protection... but we have one which seems to have other issues...
Looks like it cannot find the label of the tape: Cannot read ANSI label: expected 80 bytes, but received 0.
We have tried ltfsck with a few options, but it completes fairly fast with a similar issue about the label...
The mount command below also returns an error quickly, so it does not seem to move the tape...

ltfs -o verbose=606 -o devname=/dev/sg29 /mnt/lto
2ce5 LTFS14000I LTFS starting, LTFS version 2.5.0.0 (Prelim), log level 606.
2ce5 LTFS14058I LTFS Format Specification version 2.4.0.
2ce5 LTFS14104I Launched by "ltfs -o verbose=606 -o devname=/dev/sg29 /mnt/lto".
2ce5 LTFS14105I This binary is built for Linux (x86_64).
2ce5 LTFS14106I GCC version is 11.4.0.
2ce5 LTFS17087I Kernel version: Linux version 5.15.0-101-generic (buildd@lcy02-amd64-032) (gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0, GNU ld (GNU Binutils for Ubuntu) 2.38) #111-Ubuntu SMP Tue Mar 5 20:16:58 UTC 2024 i386.
2ce5 LTFS17089I Distribution: DISTRIB_ID=Ubuntu.
2ce5 LTFS17089I Distribution: PRETTY_NAME="Ubuntu 22.04.4 LTS".
2ce5 LTFS14063I Sync type is "time", Sync time is 300 sec.
2ce5 LTFS17085I Plugin: Loading "sg" tape backend.
2ce5 LTFS17085I Plugin: Loading "unified" iosched backend.
2ce5 LTFS14095I Set the tape device write-anywhere mode to avoid cartridge ejection.
2ce5 LTFS30209I Opening a device through sg-ibmtape driver (/dev/sg29).
2ce5 LTFS30250I Opened the SCSI tape device 10.0.0.0 (/dev/sg29).
2ce5 LTFS30207I Vendor ID is IBM     .
2ce5 LTFS30208I Product ID is ULTRIUM-HH8     .
2ce5 LTFS30214I Firmware revision is J4D1.
2ce5 LTFS30215I Drive serial is 10WT021725.
2ce5 LTFS30285I The reserved buffer size of /dev/sg29 is 1048576.
2ce5 LTFS30294I Setting up timeout values from RSOC.
2ce5 LTFS39801D SCSI timeout (op_code 0x5f, timeout = 60).
2ce5 LTFS12023D Reserving device.
2ce5 LTFS30392D Backend reserve (PRO) 10WT021725.
2ce5 LTFS39801D SCSI timeout (op_code 0x5f, timeout = 60).
2ce5 LTFS12028D Unlocking medium.
2ce5 LTFS30392D Backend allow medium removal 10WT021725.
2ce5 LTFS39801D SCSI timeout (op_code 0x1e, timeout = 60).
2ce5 LTFS30392D Backend read block limits 10WT021725.
2ce5 LTFS39801D SCSI timeout (op_code 0x05, timeout = 60).
2ce5 LTFS17160I Maximum device block size is 1048576.
2ce5 LTFS11330I Loading cartridge.
2ce5 LTFS30392D Backend load 10WT021725.
2ce5 LTFS39801D SCSI timeout (op_code 0x1b, timeout = 960).
2ce5 LTFS39801D SCSI timeout (op_code 0x34, timeout = 60).
2ce5 LTFS30398D Backend readpos: (0, 0) FM = 0 10WT021725.
2ce5 LTFS30393D Backend modesense: 63 10WT021725.
2ce5 LTFS39801D SCSI timeout (op_code 0x5a, timeout = 60).
2ce5 LTFS30392D Backend test unit ready 10WT021725.
2ce5 LTFS39801D SCSI timeout (op_code 0x00, timeout = 60).
2ce5 LTFS12026D Locking medium in the drive.
2ce5 LTFS30392D Backend prevent medium removal 10WT021725.
2ce5 LTFS39801D SCSI timeout (op_code 0x1e, timeout = 60).
2ce5 LTFS39801D SCSI timeout (op_code 0x34, timeout = 60).
2ce5 LTFS30398D Backend readpos: (0, 0) FM = 0 10WT021725.
2ce5 LTFS30392D Backend sg_set_default Resetting LBP.
2ce5 LTFS30393D Backend LBP Enable: 0 .
2ce5 LTFS30393D Backend LBP Method: 2 .
2ce5 LTFS30393D Backend modesense: 10 10WT021725.
2ce5 LTFS39801D SCSI timeout (op_code 0x5a, timeout = 60).
2ce5 LTFS30392D Backend modeselect 10WT021725.
2ce5 LTFS39801D SCSI timeout (op_code 0x55, timeout = 60).
2ce5 LTFS30252I Logical block protection is disabled.
2ce5 LTFS30397D Backend logsense: (23, 0) 10WT021725.
2ce5 LTFS39801D SCSI timeout (op_code 0x4d, timeout = 60).
2ce5 LTFS30397D Backend capacity part0: (110032, 110038) 10WT021725.
2ce5 LTFS30397D Backend capacity part1: (537531, 11224012) 10WT021725.
2ce5 LTFS30393D Backend modesense: 16 10WT021725.
2ce5 LTFS39801D SCSI timeout (op_code 0x5a, timeout = 60).
2ce5 LTFS30392D Backend read block limits 10WT021725.
2ce5 LTFS39801D SCSI timeout (op_code 0x05, timeout = 60).
2ce5 LTFS30393D Backend modesense: 16 10WT021725.
2ce5 LTFS39801D SCSI timeout (op_code 0x5a, timeout = 60).
2ce5 LTFS11332I Load successful.
2ce5 LTFS30392D Backend test unit ready 10WT021725.
2ce5 LTFS39801D SCSI timeout (op_code 0x00, timeout = 60).
2ce5 LTFS30392D Backend test unit ready 10WT021725.
2ce5 LTFS39801D SCSI timeout (op_code 0x00, timeout = 60).
2ce5 LTFS30397D Backend logsense: (23, 0) 10WT021725.
2ce5 LTFS39801D SCSI timeout (op_code 0x4d, timeout = 60).
2ce5 LTFS30397D Backend capacity part0: (110032, 110038) 10WT021725.
2ce5 LTFS30397D Backend capacity part1: (537531, 11224012) 10WT021725.
2ce5 LTFS30393D Backend modesense: 16 10WT021725.
2ce5 LTFS39801D SCSI timeout (op_code 0x5a, timeout = 60).
2ce5 LTFS30392D Backend modeselect 10WT021725.
2ce5 LTFS39801D SCSI timeout (op_code 0x55, timeout = 60).
2ce5 LTFS17157I Changing the drive setting to write-anywhere mode.
2ce5 LTFS30393D Backend modesense: 16 10WT021725.
2ce5 LTFS39801D SCSI timeout (op_code 0x5a, timeout = 60).
2ce5 LTFS11005I Mounting the volume from device.
2ce5 LTFS11012D Loading the tape.
2ce5 LTFS30392D Backend test unit ready 10WT021725.
2ce5 LTFS39801D SCSI timeout (op_code 0x00, timeout = 60).
2ce5 LTFS30392D Backend load 10WT021725.
2ce5 LTFS39801D SCSI timeout (op_code 0x1b, timeout = 960).
2ce5 LTFS39801D SCSI timeout (op_code 0x34, timeout = 60).
2ce5 LTFS30398D Backend readpos: (0, 0) FM = 0 10WT021725.
2ce5 LTFS30393D Backend modesense: 63 10WT021725.
2ce5 LTFS39801D SCSI timeout (op_code 0x5a, timeout = 60).
2ce5 LTFS30392D Backend test unit ready 10WT021725.
2ce5 LTFS39801D SCSI timeout (op_code 0x00, timeout = 60).
2ce5 LTFS39801D SCSI timeout (op_code 0x34, timeout = 60).
2ce5 LTFS30398D Backend readpos: (0, 0) FM = 0 10WT021725.
2ce5 LTFS30392D Backend sg_set_default Resetting LBP.
2ce5 LTFS30393D Backend LBP Enable: 0 .
2ce5 LTFS30393D Backend LBP Method: 2 .
2ce5 LTFS30393D Backend modesense: 10 10WT021725.
2ce5 LTFS39801D SCSI timeout (op_code 0x5a, timeout = 60).
2ce5 LTFS30392D Backend modeselect 10WT021725.
2ce5 LTFS39801D SCSI timeout (op_code 0x55, timeout = 60).
2ce5 LTFS30252I Logical block protection is disabled.
2ce5 LTFS30397D Backend logsense: (23, 0) 10WT021725.
2ce5 LTFS39801D SCSI timeout (op_code 0x4d, timeout = 60).
2ce5 LTFS30397D Backend capacity part0: (110032, 110038) 10WT021725.
2ce5 LTFS30397D Backend capacity part1: (537531, 11224012) 10WT021725.
2ce5 LTFS30393D Backend modesense: 16 10WT021725.
2ce5 LTFS39801D SCSI timeout (op_code 0x5a, timeout = 60).
2ce5 LTFS30392D Backend read block limits 10WT021725.
2ce5 LTFS39801D SCSI timeout (op_code 0x05, timeout = 60).
2ce5 LTFS30393D Backend modesense: 16 10WT021725.
2ce5 LTFS39801D SCSI timeout (op_code 0x5a, timeout = 60).
2ce5 LTFS30397D Backend locate: (0, 0) 10WT021725.
2ce5 LTFS39801D SCSI timeout (op_code 0x92, timeout = 2940).
2ce5 LTFS39801D SCSI timeout (op_code 0x34, timeout = 60).
2ce5 LTFS30398D Backend readpos: (0, 0) FM = 0 10WT021725.
2ce5 LTFS11007D Tape is loaded.
2ce5 LTFS30397D Backend logsense: (23, 0) 10WT021725.
2ce5 LTFS39801D SCSI timeout (op_code 0x4d, timeout = 60).
2ce5 LTFS30397D Backend capacity part0: (110032, 110038) 10WT021725.
2ce5 LTFS30397D Backend capacity part1: (537531, 11224012) 10WT021725.
2ce5 LTFS11008D Reading partition labels.
2ce5 LTFS30393D Backend modesense: 16 10WT021725.
2ce5 LTFS39801D SCSI timeout (op_code 0x5a, timeout = 60).
2ce5 LTFS30392D Backend read block limits 10WT021725.
2ce5 LTFS39801D SCSI timeout (op_code 0x05, timeout = 60).
2ce5 LTFS30397D Backend locate: (0, 0) 10WT021725.
2ce5 LTFS39801D SCSI timeout (op_code 0x92, timeout = 2940).
2ce5 LTFS39801D SCSI timeout (op_code 0x34, timeout = 60).
2ce5 LTFS30398D Backend readpos: (0, 0) FM = 0 10WT021725.
2ce5 LTFS30395D Backend read: 4096 bytes 10WT021725.
2ce5 LTFS39801D SCSI timeout (op_code 0x08, timeout = 2340).
2ce5 LTFS30201D CDB check condition: sense = 000001, Filemark Detected.
2ce5 LTFS30204D READ (0x08) expected error -20004.
2ce5 LTFS30219D Read block: file mark detected.
2ce5 LTFS11175E Cannot read ANSI label: expected 80 bytes, but received 0.
2ce5 LTFS11170E Failed to read label (-1012) from partition 0.
2ce5 LTFS11009E Cannot read volume: failed to read partition labels.
2ce5 LTFS14013E Cannot mount the volume from device.
2ce5 LTFS12028D Unlocking medium.
2ce5 LTFS30392D Backend allow medium removal 10WT021725.
2ce5 LTFS39801D SCSI timeout (op_code 0x1e, timeout = 60).
2ce5 LTFS30392D Backend test unit ready 10WT021725.
2ce5 LTFS39801D SCSI timeout (op_code 0x00, timeout = 60).
2ce5 LTFS30393D Backend modesense: 16 10WT021725.
2ce5 LTFS39801D SCSI timeout (op_code 0x5a, timeout = 60).
2ce5 LTFS30392D Backend modeselect 10WT021725.
2ce5 LTFS39801D SCSI timeout (op_code 0x55, timeout = 60).
2ce5 LTFS12025D Releasing device.
2ce5 LTFS30392D Backend release (PRO) 10WT021725.
2ce5 LTFS39801D SCSI timeout (op_code 0x5f, timeout = 60).
2ce5 LTFS30393D Backend LBP Enable: 0 .
2ce5 LTFS30393D Backend LBP Method: 2 .
2ce5 LTFS30393D Backend modesense: 10 10WT021725.
2ce5 LTFS39801D SCSI timeout (op_code 0x5a, timeout = 60).
2ce5 LTFS30392D Backend modeselect 10WT021725.
2ce5 LTFS39801D SCSI timeout (op_code 0x55, timeout = 60).
2ce5 LTFS30252I Logical block protection is disabled.
2ce5 LTFS39801D SCSI timeout (op_code 0x5f, timeout = 60).
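With verbose=606 the few error lines are buried in debug output. A minimal sketch for pulling them out, assuming each line has the layout seen in the log above (a hex id, then a message ID of the form LTFS plus five digits plus one severity letter, with E marking errors). The helper name `filter_by_level` and the regular expression are illustrative, not part of LTFS:

```python
import re

# Assumed layout, inferred from the log above:
#   <hex id> LTFS<5 digits><severity letter> <message text>
LOG_LINE = re.compile(
    r"^(?P<tid>[0-9a-f]+) (?P<msg_id>LTFS\d{5})(?P<level>[A-Z]) (?P<text>.*)$"
)


def filter_by_level(lines, levels=("E",)):
    """Return (message id, text) pairs whose severity letter is in `levels`."""
    hits = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and match.group("level") in levels:
            hits.append(
                (match.group("msg_id") + match.group("level"), match.group("text"))
            )
    return hits


# Four lines copied from the log above.
sample = [
    "2ce5 LTFS30219D Read block: file mark detected.",
    "2ce5 LTFS11175E Cannot read ANSI label: expected 80 bytes, but received 0.",
    "2ce5 LTFS11170E Failed to read label (-1012) from partition 0.",
    "2ce5 LTFS30252I Logical block protection is disabled.",
]

for msg_id, text in filter_by_level(sample):
    print(msg_id, text)
```

Passing `levels=("E", "D")` would also keep the debug lines, such as the "file mark detected" line that immediately precedes the label error.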

@piste-jp-ibm
Member

> I cannot find any LTFS-related log files in /var/log. I will try to mount one of the tapes again and give it a go without the write-protection. The reason we do not know which version we wrote the tapes with is that we have decommissioned the old server. But we have restored files from the tapes back on the original server.

LTFS calls syslog with the default facility (LOG_USER), so you need to configure the syslog (and logrotate) settings on your machine.

A sample is available under the docs directory.
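One way to check where LOG_USER messages end up after changing the syslog configuration is to send a probe message with that facility and then look for it under /var/log. A minimal sketch using Python's standard `syslog` module on Linux; the ident `ltfs-test` and the helper name are made up for illustration, and this does not reproduce how LTFS itself opens the log:

```python
import syslog


def send_test_message(text="LTFS syslog routing test"):
    """Send one message with the LOG_USER facility and return the text sent."""
    # "ltfs-test" is a hypothetical ident so the probe is easy to grep for.
    syslog.openlog(ident="ltfs-test", facility=syslog.LOG_USER)
    syslog.syslog(syslog.LOG_INFO, text)
    syslog.closelog()
    return text


if __name__ == "__main__":
    send_test_message()
```

After running it, searching the files under /var/log for `ltfs-test` shows which file, if any, receives user-facility messages on that machine.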

@piste-jp-ibm
Member

> Well we managed to mount a few tapes by just removing the write-protection... but we have one which seems to have other issues... Looks like it cannot find the label of the tape: Cannot read ANSI label: expected 80 bytes, but received 0. We have tried ltfsck with a few options, but it completes fairly fast with a similar issue about the label... The mount command below also returns an error quickly, so it does not seem to move the tape...

I don't think this is similar to the original problem at all. In this case, the first block in partition 0 seems to have been overwritten by an FM (filemark). Someone must have overwritten it; LTFS never writes an FM at the beginning of a partition!

It looks like this is similar to #431. Did you check the drive condition with the mt command through the st device? That is really dangerous, because the st device rewinds the tape automatically on close, and this operation ends up corrupting the data sequence on tape from the LTFS format spec perspective.
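To confirm that the st driver is really out of the picture on a given host, one option is to look for it in /proc/modules. A minimal sketch, assuming st is built as a loadable module (a driver compiled into the kernel would not be listed there); the function names are illustrative:

```python
def loaded_modules(proc_modules_text):
    """Return the module names listed in /proc/modules-style text."""
    # The first whitespace-separated field of each line is the module name.
    return {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }


def st_is_loaded(proc_modules_text):
    """True if a module named exactly 'st' appears in the listing."""
    return "st" in loaded_modules(proc_modules_text)


if __name__ == "__main__":
    with open("/proc/modules") as handle:
        print("st loaded:", st_is_loaded(handle.read()))
```

Matching on the exact module name avoids false positives from unrelated modules whose names merely contain "st".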

@heinowalther
Author

We might not have been aware of the dangers of the st driver. I did read the #431 post and have blocked the st driver.
I guess there is no way to read the tape to potentially fix it? If it's only the first block that has been overwritten, maybe it's possible to read the whole tape and reconstruct it?

@piste-jp-ibm
Member

You would need to inspect all the data on the tape first.

I strongly recommend rewriting it if you have another copy on disk or on another tape.

@heinowalther
Author

We have the data on disk as well, but on "non-spinning" disks, so it's a bit of an effort to get them set up again :-)
Because most of the tapes were written with the same procedure, we are pretty sure that it is most likely only the first positions on the tape that are affected. We will most likely just rewrite the tape, but if there is a way to repair it with ltfsck, we would like to test that as well. All our efforts with ltfsck fail, though; maybe we are not using the right options?

@piste-jp-ibm
Member

This kind of corruption is out of scope for ltfsck, because the labels (the ANSI label and the LTFS label) are written only once, at format time, by mkltfs.

Indeed, you could fix the tapes if you are strongly confident that the label was simply overwritten. But you would need to do it yourself by issuing low-level SCSI commands, using the tape drive's specification documents. (I don't think I can spend my time helping you in this area for free.)

Finally, I will give you a couple of suggestions.

  1. Add -o eject to the mount command, like ltfs -o devname=/dev/sg5 -o eject /mnt/ltfs. This option enables append-only mode, which uses the append-only feature of IBM tape drives to avoid this kind of disaster. (Some errors might be reported, though.)
  2. Add one more step at the end of your procedure to check that the tape is mountable.
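The two suggestions could be combined into a small post-write check. A minimal sketch, assuming that `ltfs` returns a non-zero exit status when the mount fails and that the volume is unmounted with a plain `umount`; the function names and the injectable `run` and `listdir` parameters are hypothetical and exist only to make the sketch testable without a drive:

```python
import os
import subprocess


def build_mount_cmd(device, mount_point, eject=True):
    """Build the ltfs mount command, optionally with -o eject."""
    cmd = ["ltfs", "-o", f"devname={device}"]
    if eject:
        cmd += ["-o", "eject"]
    cmd.append(mount_point)
    return cmd


def check_mountable(device, mount_point, run=subprocess.run, listdir=os.listdir):
    """Mount the tape, list its root directory, then unmount it.

    Returns True only if all three steps succeed.
    """
    if run(build_mount_cmd(device, mount_point)).returncode != 0:
        return False
    try:
        listdir(mount_point)
        readable = True
    except OSError:
        readable = False
    # Always attempt the unmount, even if the listing failed.
    unmounted = run(["umount", mount_point]).returncode == 0
    return readable and unmounted


if __name__ == "__main__":
    # Device and mount point taken from the example command above.
    print(check_mountable("/dev/sg5", "/mnt/ltfs"))
```

Because the check mounts with -o eject, the cartridge should come out of the drive once the unmount completes, so this fits best as the very last step of the write procedure.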

@piste-jp-ibm
Member

Closing because there has been no activity for a long time.
