Title: | LSM |
Moderator: | SMURF::SHIDERLY |
Created: | Mon Jan 17 1994 |
Last Modified: | Fri Jun 06 1997 |
Last Successful Update: | Fri Jun 06 1997 |
Number of topics: | 803 |
Total number of notes: | 2852 |
Hi,

I am not sure whether we have a problem, but something in the volprint and volstat output I am seeing after an unclean shutdown has me confused. The system is an 8400 running Digital UNIX V4.0a. It went down abnormally, hence the LSM volumes needed to be resynchronized. However, it is unclear to me what is taking place.

On prior versions, and in all prior cases I have seen, one of the plexes is always in a sync state and WO state while being resynchronized. In this case, however, both plexes are in the ENABLED/ACTIVE RW state and only the volume is in an ENABLED/SYNC state. What is going on? volstat tells me it is reading and writing to both plexes.

Can you explain whether this is a problem? If not, what is taking place? Has a new synching algorithm been introduced that I am not aware of?

Thanks in advance for the explanation.

Debbie Trenta
Lucent/AT&T Platinum Services Support
Colorado CSC

The data follows:

    Starting secondary cpu 5
    LSM: Resynchronization of volume rootvol in group rootdg started.
    /sbin/ufs_fsck -p /dev/rvol/rootdg/rootvol
    /dev/rvol/rootdg/rootvol: 1913 files, 72516 used, 72499 free (115 frags, 9048 blocks, 0.1% fragmentation)
    starting LSM
    LSM: Resynchronization of volume vol-rz49h in group rootdg started.
    LSM: Resynchronization of volume vol-rz49g in group rootdg started.
    Checking local filesystems
    /sbin/ufs_fsck -p
    /dev/rvol/rootdg/rootvol: 1913 files, 72516 used, 72499 free (115 frags, 9048 blocks, 0.1% fragmentation)
    /dev/rvol/rootdg/vol-rz49g: 8794 files, 192392 used, 1746318 free (1326 frags, 218124 blocks, 0.1% fragmentation)
    LSM: Resynchronization of volume rootvol in group rootdg finished.
    /dev/rvol/rootdg/vol-rz49h: 507 files, 5892 used, 1680586 free (554 frags, 210004 blocks, 0.0% fragmentation)
    Mounting / (root) .....
    Binary error logger started
    Starting ASE ...
    Apr 29 13:45:42 sd9801 vmunix: LSM: Resynchronization of volume rootvol in group rootdg started.
    Apr 29 13:45:42 sd9801 vmunix: LSM: Resynchronization of volume vol-rz49h in group rootdg started.
    ONC portmap service started
    Apr 29 13:45:42 sd9801 vmunix: LSM: Resynchronization of volume vol-rz49g in group rootdg started.
    Initializing the ASE Availability Manager
    Apr 29 13:45:43 sd9801 vmunix: LSM: Resynchronization of volume rootvol in group rootdg finished.
    ASE logger started (/usr/sbin/aselogger)
    ASE agent started (/usr/sbin/aseagent)
    ASE member started
    Setting kernel timezone variable ....

    sd9801# volprint -ht
    DG NAME         GROUP-ID
    DM NAME         DEVICE       TYPE     PRIVLEN  PUBLEN   PUBPATH
    V  NAME         USETYPE      KSTATE   STATE    LENGTH   READPOL   PREFPLEX
    PL NAME         VOLUME       KSTATE   STATE    LENGTH   LAYOUT    ST-WIDTH MODE
    SD NAME         PLEX         PLOFFS   DISKOFFS LENGTH   DISK-NAME DEVICE

    dg rootdg       862325304.1025.sd9801

    dm rz49a        rz49a        nopriv   0        300000   /dev/rrz49a
    dm rz49b        rz49b        nopriv   0        598976   /dev/rrz49b
    dm rz49d        rz49d        simple   1024     0        /dev/rrz49d
    dm rz49g        rz49g        nopriv   0        4000000  /dev/rrz49g
    dm rz49h        rz49h        nopriv   0        3480080  /dev/rrz49h
    dm rz65a        rz65a        nopriv   0        300000   /dev/rrz65a
    dm rz65b        rz65b        nopriv   0        598976   /dev/rrz65b
    dm rz65d        rz65d        simple   1024     0        /dev/rrz65d
    dm rz65g        rz65g        nopriv   0        4000000  /dev/rrz65g
    dm rz65h        rz65h        nopriv   0        3480080  /dev/rrz65h

    v  rootvol      root         ENABLED  ACTIVE   300000   ROUND     -
    pl rootvol-01   rootvol      ENABLED  ACTIVE   300000   CONCAT    -        RW
    sd rz49a-01p    rootvol-01   0        0        16       rz49a     rz49a
    sd rz49a-01     rootvol-01   16       16       299984   rz49a     rz49a
    pl rootvol-02   rootvol      ENABLED  ACTIVE   300000   CONCAT    -        RW
    sd rz65a-01p    rootvol-02   0        0        16       rz65a     rz65a
    sd rz65a-01     rootvol-02   16       16       299984   rz65a     rz65a

    v  swapvol      swap         ENABLED  ACTIVE   598976   ROUND     -
    pl swapvol-01   swapvol      ENABLED  ACTIVE   598976   CONCAT    -        RW
    sd rz49b-01     swapvol-01   0        0        598976   rz49b     rz49b
    pl swapvol-02   swapvol      ENABLED  ACTIVE   598976   CONCAT    -        RW
    sd rz65b-01     swapvol-02   0        0        598976   rz65b     rz65b

    v  vol-rz49g    fsgen        ENABLED  SYNC     4000000  SELECT    -
    pl vol-rz49g-01 vol-rz49g    ENABLED  ACTIVE   4000000  CONCAT    -        RW
    sd rz49g-01     vol-rz49g-01 0        0        4000000  rz49g     rz49g
    pl vol-rz49g-02 vol-rz49g    ENABLED  ACTIVE   4000000  CONCAT    -        RW
    sd rz65g-01     vol-rz49g-02 0        0        4000000  rz65g     rz65g

    v  vol-rz49h    fsgen        ENABLED  NEEDSYNC 3480080  SELECT    -
    pl vol-rz49h-01 vol-rz49h    ENABLED  ACTIVE   3480080  CONCAT    -        RW
    sd rz49h-01     vol-rz49h-01 0        0        3480080  rz49h     rz49h
    pl vol-rz49h-02 vol-rz49h    ENABLED  ACTIVE   3480080  CONCAT    -        RW
    sd rz65h-01     vol-rz49h-02 0        0        3480080  rz65h     rz65h

    sd9801# ps ax | grep vol
        8 ??       I      0:00.50 vold -k -m boot
       27 ??       I      0:00.01 /sbin/volrecover -b -o iosize=64k -s
       40 ??       I      0:00.02 /etc/vol/type/fsgen/volume -U fsgen -g 862325304.1025.sd9801 -o iosize 64k -- resync vol-rz49g
       41 ??       U      0:08.09 /etc/vol/type/fsgen/volume -U fsgen -g 862325304.1025.sd9801 -o iosize 64k -- resync vol-rz49g
      576 ??       I      0:00.02 sh /usr/sbin/volwatch root
      586 ??       I      0:00.01 sh /usr/sbin/volwatch root
      587 ??       I      0:00.01 volnotify -f -w 15
      667 console  S +    0:00.01 grep vol

    sd9801# volstat -r
    sd9801# volstat -d
                  OPERATIONS       BLOCKS       AVG TIME(ms)
    TYP NAME       READ   WRITE    READ   WRITE   READ  WRITE
    dm  rz49a         0       0       0       0    0.0    0.0
    dm  rz49b         0       0       0       0    0.0    0.0
    dm  rz49d         0       0       0       0    0.0    0.0
    dm  rz49g        28      29    3584    3712   18.0   33.5
    dm  rz49h         0       0       0       0    0.0    0.0
    dm  rz65a         0       0       0       0    0.0    0.0
    dm  rz65b         0       0       0       0    0.0    0.0
    dm  rz65d         0       0       0       0    0.0    0.0
    dm  rz65g        29      29    3712    3712   17.9   34.7
    dm  rz65h         0       0       0       0    0.0    0.0

[Posted by WWW Notes gateway]
T.R | Title | User | Personal Name | Date | Lines |
---|---|---|---|---|---|
787.1 | all is OK ! | BRSDVP::DEVOS | Manu Devos NSIS Brussels 856-7539 | Tue Apr 29 1997 19:02 | 16 |
Hi Debbie, No, there is no new synch mechanism under the sun... When a system crashes, no plex of a volume is known to be OK. So the synchronizing mechanism simply opens the volume in read/write-back mode. This means that the volume is opened as usual (round-robin or preferred-plex mode) and a whole-volume read process is started, which causes every read from one plex to be written back to the other plex. When the volume read process has finished, we are sure that the two plexes are the same. This has been the standard procedure since the beginning of LSM. So, don't worry, be happy ! Manu. | |||||
787.2 | thanks :-) Can you answer another about PSL? | CSC32::TRENTA | | Wed Apr 30 1997 12:29 | 28 |
Manu, Thanks for the reply. So my understanding of what happens during a crash has obviously been wrong. I always thought writes were done to the 1st plex first and then to the 2nd. So when a crash happened, it did a read of the 1st and then a write only to the 2nd. (This type of sync obviously only happens when one of the plexes becomes disabled or inactive for some reason.) So then, for clarification's sake, are you saying that whatever plex it happens to read from is assumed correct, and that a write of that data is then done back to itself and all other plexes? Makes sense - I just never realized that. Pardon the ignorance. I guess I just never really looked at the synching that was done after a crash before. Could you please answer another question I have, about "Persistent State Logging"? How does it know which volumes to synch upon reboot? I know that the log keeps a record of the first write and last close to a volume. So I am assuming this means that if a volume was active/enabled but never written to (even though it was in a R/W state), LSM knows that the volume does not have to be resynched after a crash. Am I right in my understanding of how Persistent State Logging works? Thanks again Manu. I appreciate the clarification. Debbie | |||||
787.3 | | BRSDVP::DEVOS | Manu Devos NSIS Brussels 856-7539 | Thu May 01 1997 11:40 | 38 |
Hi Debbie, >> So then for clarification sake, are you saying that whatever plex it >> happens to read from that plex is assumed correct and then a write >> of that data is done back to itself and all other plexes? This is true when a volume appears at LSM startup with both plexes in the "ACTIVE" state, which is typical when a system has crashed. It is also the case when you start a newly created mirrored volume for the first time. You said above "... back to itself and all.."; no, there is no write back to the plex just read, only to the other. As you noticed, this method is NOT applied when ONE plex appears as STALE at LSM startup. In that case the stale plex is placed in Write Only mode (WO), and the volume is opened NORMALLY (i.e. not in rwback mode). Again a whole-volume read process is started, and the other plex(es) become the source of the data copied to the WO (stale) plex. A STALE plex is a plex which has not been updated during the lifetime of the volume, because it has been detached either intentionally or automatically due to an error on one of the subdisks (or disks) underneath it. >> Could you please explain another question I have then about >> "Persistent State Logging" ? What volumes does it know to synch >> upon reboot? Meaning I know that the log keeps a record of the >> first write and last close to a volume. So then I am assuming >> this means that if a volume was active/enabled but never written >> to (even though it was in a R/W state) that LSM knows that the >> volume does not have to be resynched after a crash. Am I >> right in my understanding of how Persistent State Logging works? You are right when you say that it knows whether that volume should be resynchronized. But PSL is also used in other situations like BCL, and also to store various states of the DM disks, plexes and volumes. But I am responding from home, without any doc, so maybe Engineering could complete this answer. Regards, Manu. |
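
The two recovery paths described in this thread can be sketched as a small simulation. This is only an illustrative model, not LSM's implementation: the names `Plex`, `Volume`, `resync` and the `dirty` flag are hypothetical, and `dirty` stands in loosely for the "first write / last close" record that Persistent State Logging is said to keep. With all plexes ACTIVE after a crash, each block is read from one plex (round-robin here) and written to the other plexes; with a STALE plex, the ACTIVE plex is the only source and the stale plex is written only.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Plex:
    name: str
    blocks: List[bytes]
    state: str = "ACTIVE"  # "ACTIVE" or "STALE" (simplified state model)


@dataclass
class Volume:
    name: str
    plexes: List[Plex]
    # Hypothetical flag: True once the open volume has been written to,
    # cleared again on a clean close. A crash leaves it set.
    dirty: bool = False

    def write(self, index: int, data: bytes) -> None:
        self.dirty = True  # the first write marks the volume as needing resync
        for plex in self.plexes:
            if plex.state == "ACTIVE":
                plex.blocks[index] = data

    def close(self) -> None:
        self.dirty = False  # a clean last close means no resync is needed


def resync(vol: Volume) -> str:
    """Bring all plexes of vol into agreement; return the mode that was used."""
    active = [p for p in vol.plexes if p.state == "ACTIVE"]
    stale = [p for p in vol.plexes if p.state == "STALE"]
    if not active:
        raise RuntimeError("no ACTIVE plex to read from")

    if stale:
        # Stale-plex recovery: the stale plex is write-only, and the
        # ACTIVE plex is the source for every block.
        source = active[0]
        for i, data in enumerate(source.blocks):
            for plex in stale:
                plex.blocks[i] = data
        for plex in stale:
            plex.state = "ACTIVE"
        vol.dirty = False
        return "stale-copy"

    if not vol.dirty:
        return "none"  # never written since open: nothing to resynchronize

    # Read/write-back: no plex is trusted over the others. Each block is
    # read from whichever plex the read policy picks (round-robin here)
    # and written to the OTHER plexes, never back to the one just read.
    for i in range(len(active[0].blocks)):
        source = active[i % len(active)]
        data = source.blocks[i]
        for plex in active:
            if plex is not source:
                plex.blocks[i] = data
    vol.dirty = False
    return "read-writeback"


if __name__ == "__main__":
    p1 = Plex("vol-01", [b"a", b"b", b"c", b"d"])
    p2 = Plex("vol-02", [b"a", b"X", b"c", b"Y"])  # diverged during the crash
    vol = Volume("vol", [p1, p2], dirty=True)
    print(resync(vol), p1.blocks == p2.blocks)
```

In this model the round-robin read policy makes both plexes act as source and as target during a read/write-back pass, which is consistent with volstat showing reads and writes on both rz49g and rz65g while both plexes stay ENABLED/ACTIVE.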