Doing XFS repair for LVM volumes that are auto mounted by cluster configuration


Home > Suse > SAP setup and maintenance > Doing XFS repair for LVM volumes that are auto mounted by cluster configuration

If there is an issue with an XFS filesystem due to which it cannot be mounted, then crm status may show errors such as:

Failed Resource Actions:
* rsc_fs_ASCS02_sapEXAas_start_0 on exaprdapp01 'unknown error' (1): call=105, status=complete, exitreason='Couldn't mount device [/dev/usrsapEXAASCS02VG/usrsapEXAASCS02lv] as /usr/sap/EXA/ASCS02',
    last-rc-change='Wed Nov 22 10:24:31 2023', queued=0ms, exec=273ms
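To confirm which device and mount point the failing Filesystem resource manages, the cluster configuration can be inspected with crmsh. This is only a sketch; the resource name below is taken from the example output above and should be adjusted to the actual environment:

    #Show the definition of the failing Filesystem primitive (device, directory, fstype)
    crm configure show rsc_fs_ASCS02_sapEXAas
    #Or list all Filesystem primitives along with their parameters
    crm configure show | grep -B1 -A3 'ocf:heartbeat:Filesystem'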


In such cases, /var/log/messages may also have details about the issue with the filesystem:

2023-11-22T10:24:31.660049+05:30 exaprdapp01 kernel: XFS (dm-9): Mounting V5 Filesystem
2023-11-22T10:24:31.820036+05:30 exaprdapp01 kernel: XFS (dm-9): Starting recovery (logdev: internal)
2023-11-22T10:24:31.850066+05:30 exaprdapp01 kernel: XFS (dm-9): Internal error XFS_WANT_CORRUPTED_GOTO at line 1737 of file ../fs/xfs/libxfs/xfs_alloc.c.  Caller xfs_free_extent+0xd4/0x1f0 [xfs]
2023-11-22T10:24:31.850085+05:30 exaprdapp01 kernel: [c000000787f97650] [d000000008d1eeec] xfs_error_report+0x64/0x80 [xfs]
2023-11-22T10:24:31.850087+05:30 exaprdapp01 kernel: [c000000787f976b0] [d000000008cb6cf4] xfs_free_ag_extent+0x26c/0xa60 [xfs]
2023-11-22T10:24:31.850091+05:30 exaprdapp01 kernel: [c000000787f97770] [d000000008cbaafc] xfs_free_extent+0xd4/0x1f0 [xfs]
2023-11-22T10:24:31.850093+05:30 exaprdapp01 kernel: [c000000787f977f0] [d000000008d69958] xfs_trans_free_extent+0x60/0x190 [xfs]
2023-11-22T10:24:31.850096+05:30 exaprdapp01 kernel: [c000000787f97860] [d000000008d58b4c] xfs_efi_recover+0x1b4/0x218 [xfs]
2023-11-22T10:24:31.850099+05:30 exaprdapp01 kernel: [c000000787f978c0] [d000000008d5da20] xlog_recover_process_efi+0x58/0xa0 [xfs]
2023-11-22T10:24:31.850101+05:30 exaprdapp01 kernel: [c000000787f978f0] [d000000008d5dd4c] xlog_recover_process_intents+0x104/0x200 [xfs]
2023-11-22T10:24:31.850104+05:30 exaprdapp01 kernel: [c000000787f97950] [d000000008d65f50] xlog_recover_finish+0x38/0x130 [xfs]
2023-11-22T10:24:31.850108+05:30 exaprdapp01 kernel: [c000000787f979c0] [d000000008d4ff70] xfs_log_mount_finish+0x58/0x140 [xfs]
2023-11-22T10:24:31.850110+05:30 exaprdapp01 kernel: [c000000787f979f0] [d000000008d3ed40] xfs_mountfs+0x708/0xa38 [xfs]
2023-11-22T10:24:31.850112+05:30 exaprdapp01 kernel: [c000000787f97ab0] [d000000008d4759c] xfs_fs_fill_super+0x474/0x6b0 [xfs]
2023-11-22T10:24:31.850117+05:30 exaprdapp01 kernel: [c000000787f97bf0] [d000000008d44f30] xfs_fs_mount+0x28/0x50 [xfs]
2023-11-22T10:24:31.850131+05:30 exaprdapp01 kernel: XFS (dm-9): Internal error xfs_trans_cancel at line 1005 of file ../fs/xfs/xfs_trans.c.  Caller xfs_efi_recover+0x1d4/0x218 [xfs]
2023-11-22T10:24:31.850140+05:30 exaprdapp01 kernel: [c000000787f977c0] [d000000008d1eeec] xfs_error_report+0x64/0x80 [xfs]
2023-11-22T10:24:31.850142+05:30 exaprdapp01 kernel: [c000000787f97820] [d000000008d4bc9c] xfs_trans_cancel+0x104/0x130 [xfs]
2023-11-22T10:24:31.850144+05:30 exaprdapp01 kernel: [c000000787f97860] [d000000008d58b6c] xfs_efi_recover+0x1d4/0x218 [xfs]
2023-11-22T10:24:31.850145+05:30 exaprdapp01 kernel: [c000000787f978c0] [d000000008d5da20] xlog_recover_process_efi+0x58/0xa0 [xfs]
2023-11-22T10:24:31.850147+05:30 exaprdapp01 kernel: [c000000787f978f0] [d000000008d5dd4c] xlog_recover_process_intents+0x104/0x200 [xfs]
2023-11-22T10:24:31.850149+05:30 exaprdapp01 kernel: [c000000787f97950] [d000000008d65f50] xlog_recover_finish+0x38/0x130 [xfs]
2023-11-22T10:24:31.850151+05:30 exaprdapp01 kernel: [c000000787f979c0] [d000000008d4ff70] xfs_log_mount_finish+0x58/0x140 [xfs]
2023-11-22T10:24:31.850152+05:30 exaprdapp01 kernel: [c000000787f979f0] [d000000008d3ed40] xfs_mountfs+0x708/0xa38 [xfs]
2023-11-22T10:24:31.850154+05:30 exaprdapp01 kernel: [c000000787f97ab0] [d000000008d4759c] xfs_fs_fill_super+0x474/0x6b0 [xfs]
2023-11-22T10:24:31.850158+05:30 exaprdapp01 kernel: [c000000787f97bf0] [d000000008d44f30] xfs_fs_mount+0x28/0x50 [xfs]
2023-11-22T10:24:31.850169+05:30 exaprdapp01 kernel: XFS (dm-9): xfs_do_force_shutdown(0x8) called from line 1006 of file ../fs/xfs/xfs_trans.c.  Return address = 0xd000000008d4bcb4
2023-11-22T10:24:31.850172+05:30 exaprdapp01 kernel: XFS (dm-9): Corruption of in-memory data detected.  Shutting down filesystem
2023-11-22T10:24:31.850174+05:30 exaprdapp01 kernel: XFS (dm-9): Please umount the filesystem and rectify the problem(s)
2023-11-22T10:24:31.850176+05:30 exaprdapp01 kernel: XFS (dm-9): Failed to recover intents
2023-11-22T10:24:31.850178+05:30 exaprdapp01 kernel: XFS (dm-9): log mount finish failed
2023-11-22T10:35:22.661744+05:30 exaprdapp01 kernel: XFS (dm-9): Mounting V5 Filesystem
2023-11-22T10:35:22.678953+05:30 exaprdapp01 kernel: XFS (dm-9): Starting recovery (logdev: internal)
2023-11-22T10:35:22.690052+05:30 exaprdapp01 kernel: XFS (dm-9): Internal error XFS_WANT_CORRUPTED_GOTO at line 1737 of file ../fs/xfs/libxfs/xfs_alloc.c.  Caller xfs_free_extent+0xd4/0x1f0 [xfs]
2023-11-22T10:35:22.690067+05:30 exaprdapp01 kernel: [c000000734bd3650] [d000000008d1eeec] xfs_error_report+0x64/0x80 [xfs]
2023-11-22T10:35:22.690068+05:30 exaprdapp01 kernel: [c000000734bd36b0] [d000000008cb6cf4] xfs_free_ag_extent+0x26c/0xa60 [xfs]
2023-11-22T10:35:22.690070+05:30 exaprdapp01 kernel: [c000000734bd3770] [d000000008cbaafc] xfs_free_extent+0xd4/0x1f0 [xfs]
2023-11-22T10:35:22.690072+05:30 exaprdapp01 kernel: [c000000734bd37f0] [d000000008d69958] xfs_trans_free_extent+0x60/0x190 [xfs]
2023-11-22T10:35:22.690074+05:30 exaprdapp01 kernel: [c000000734bd3860] [d000000008d58b4c] xfs_efi_recover+0x1b4/0x218 [xfs]
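The kernel messages reference the device-mapper name (dm-9 in this example) rather than the LVM volume name. A quick way to map the dm-N device back to its logical volume, using standard util-linux and device-mapper commands (adjust the device name as needed):

    #Show which LV the dm device belongs to
    lsblk /dev/dm-9
    #Or list device-mapper names with their major:minor numbers
    dmsetup ls
    ls -l /dev/mapper/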


To solve this, use the following steps:

  1. Put the running node into maintenance mode, after powering off the other node, to avoid any complex errors:
    crm node maintenance exaprdapp01
  2. Manually activate the volume group containing the logical volume with errors:
    vgchange -a y usrsapEXAASCS02VG
  3. Do a dry run of repairing the damaged XFS filesystem:
    xfs_repair -n /dev/usrsapEXAASCS02VG/usrsapEXAASCS02lv
    If the filesystem is damaged, this will normally print a long list of detected problems.
  4. We can test-mount the filesystem manually to confirm that a repair is really required:
    mkdir /mnt/test1
    mount /dev/usrsapEXAASCS02VG/usrsapEXAASCS02lv /mnt/test1/
    The above mount should fail for a damaged filesystem with an error such as 'mount: /mnt/test1: mount(2) system call failed: Structure needs cleaning.'
  5. We can take a dd backup of the underlying disk before the repair (see the sketch after this list for confirming which disk backs the volume group):
    mkdir /var/usrsapEXAASCS02VG-2023-11-22-dd-backup-before-repair
    vgchange -a n usrsapEXAASCS02VG
    vgdisplay -v usrsapEXAASCS02VG #note the physical volume(s)
    fdisk -l
    dd if=/dev/sde of=/var/usrsapEXAASCS02VG-2023-11-22-dd-backup-before-repair/dev-sde.raw status=progress
  6. Now activate the VG again and run xfs_repair (the -L option zeroes the metadata log, discarding any log entries that could not be replayed):
    vgchange -a y usrsapEXAASCS02VG
    xfs_repair -L /dev/usrsapEXAASCS02VG/usrsapEXAASCS02lv
    mount /dev/usrsapEXAASCS02VG/usrsapEXAASCS02lv /mnt/test1/
    df -h
    After the repair, the mount should succeed.
  7. Assuming there are no other filesystems requiring repair, proceed further. The various filesystems can be seen in the output of LVM commands or in the crm configuration. We can try to mount each of them on a test folder after manually activating the corresponding volume group.
  8. Unmount the filesystem, remove the node from maintenance mode, and reboot:
    umount /mnt/test1
    crm node ready exaprdapp01
    shutdown -r now
  9. Don't forget to delete the backup taken in '/var/usrsapEXAASCS02VG-2023-11-22-dd-backup-before-repair/' when it is no longer needed.
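
For step 5 above, the physical disk(s) backing the volume group can be confirmed before taking the dd image. This is a sketch using standard LVM and util-linux commands; the VG name and /dev/sde are taken from the example above and must be adjusted to the actual environment:

    #List physical volumes and the VG they belong to
    pvs -o pv_name,vg_name,pv_size | grep usrsapEXAASCS02VG
    #Cross-check the disk size and layout so the dd source is unambiguous
    lsblk -o NAME,SIZE,TYPE,MOUNTPOINT /dev/sde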

