T.R | Title | User | Personal Name | Date | Lines |
---|---|---|---|---|---
105.1 | | DECWET::ONO | The Wrong Stuff | Thu Oct 31 1996 20:13 | 22 |
105.2 | Need to develop strategy for TruCluster recovery | USPS::FPRUSS | Frank Pruss, 202-232-7347 | Mon Apr 07 1997 15:15 | 21 |
| Here is a wrinkle:
We have a TruCluster that is running ORACLE OPS 7.3.2.3. The only
tape drives are in the Exabyte 8mm libraries, and these libraries are
attached only to the one node running the NSR server.
Now, the operator can use a library drive as a "single" drive, so it
should be easy to do a traditional level 0 backup of the system disk
on this node.
I believe we should also be able to use a tape drive remotely to
achieve a backup of the node that has no drive, or to dump partitions
to a file on a disk NFS-served by the node with the tapes.
But I'm not sure how we could go about a restore of these level 0 dumps
to bring back a TruCluster from a "disaster".
NB: ALL drives and partitions are LSM-mirrored for "safety", but I fear
that this will add complexity to the restore process!
|
105.3 | | DECWET::FARLEE | Insufficient Virtual um...er.... | Tue Apr 08 1997 14:27 | 16 |
| You'll have to fill in some gaps in your question here:
First off, where does NetWorker fit in your scenario? Are you
talking about using NetWorker for your "level 0 dumps"? If not,
exactly what are you proposing?
Secondly, we have produced a disaster recovery manual. If you consider
the cluster node with the NetWorker server as the server node, and the
other node(s) in the cluster as clients (which is how NetWorker thinks of them)
then your problem is no different than any other disaster recovery
involving several systems.
Maybe you can state more clearly why you think that TruCluster makes
the disaster recovery more complex?
Kevin
|
105.4 | Here the tricks ... | BACHUS::DEVOS | Manu Devos DEC/SI Brussels 856-7539 | Wed Apr 09 1997 01:47 | 32 |
| Frank,
If the problem is to take a "vdump -0" of the Cluster's systems without
tapes (in anticipation of a disaster), then you can simply use this
command. Let's say that SYSTEM-A has the tape and SYSTEM-B & SYSTEM-C
have no tape device:
SYSTEM-B # vdump -0 -f - / | rsh SYSTEM-A "dd of=/dev/nrmt0h bs=60k"
SYSTEM-C # vdump -0 -f - / | rsh SYSTEM-A "dd of=/dev/nrmt0h bs=60k"
The option "bs=60k" is needed to keep the vdump format (See vdump(8)).
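The pipe stage can be rehearsed locally before touching a real tape. A
minimal sketch, with a plain file under /tmp standing in for /dev/nrmt0h
(an assumption for illustration only), showing that the dd stage passes
the piped stream through intact:

```shell
# Local sketch: a plain file stands in for the tape device (an
# assumption, not from the note).  dd re-blocks the piped stream and
# the byte count is preserved end to end: 150 x 1 KiB in, 153600 out.
dd if=/dev/zero bs=1k count=150 2>/dev/null | dd of=/tmp/blocked.img bs=60k 2>/dev/null
wc -c < /tmp/blocked.img
```

On a real tape the block size also fixes the record length of each
write, which is why it must match what vrestore expects to read back.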
So, if SYSTEM-B is completely crashed, you can boot it from the CD,
create the root and/or /usr disk device files, mount the root disk
device on /mnt, and now the trick:
# hostname SYSTEM-B
# ifconfig tu0 x.x.x.x netmask y.y.y.y
# echo "w.w.w.w SYSTEM-A" > /etc/hosts
# rsh SYSTEM-A "dd if=/dev/nrmt0h bs=60k" | vrestore -x -f - -D /mnt
Replace x.x.x.x with the IP address of SYSTEM-B and y.y.y.y with its
netmask; the "echo" above creates a one-line /etc/hosts file so that
SYSTEM-B can resolve "SYSTEM-A". Replace w.w.w.w with the IP address
of SYSTEM-A.
Then, proceed similarly for /usr. You can now reboot the system and
contact the NSR server to restore / and /usr to their last state...
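The whole dump-and-restore round trip can also be dry-run on one machine.
A hedged sketch, with tar standing in for vdump/vrestore and a file under
/tmp standing in for the tape (both substitutions are assumptions for
illustration; the real commands are the ones above):

```shell
# Local dry run of the pipe pattern: archive a tree through dd into a
# "tape" file, then read it back through dd and extract it elsewhere.
mkdir -p /tmp/demo/src /tmp/demo/dst
echo "hello" > /tmp/demo/src/f
(cd /tmp/demo/src && tar cf - .) | dd of=/tmp/demo/tape.img bs=60k 2>/dev/null
dd if=/tmp/demo/tape.img bs=60k 2>/dev/null | (cd /tmp/demo/dst && tar xf -)
cat /tmp/demo/dst/f
```

If the extracted file matches the original, the plumbing is sound; only
the archiver and the device then differ from the real procedure.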
Easy, isn't it ?
Manu.
|
105.5 | This looks good! Ever thought of moving to Missouri? | USPS::FPRUSS | Frank Pruss, 202-232-7347 | Wed Apr 09 1997 18:12 | 18 |
| Manu,
If you are saying that we can now get the network up after booting the
UNIX CD in .4, then there is no problem!
We knew how to take the appropriate vdumps, but weren't sure of the
best way to restore. We always had the last resort of moving
"B-TAPELESS's" disks over to A for the restore, then putting them back
on B, but we want to avoid that.
I did not catch the mandatory block size for vdump/vrestore before,
thanks!
Now to figure out a way to model this on a one-node system. Shouldn't
be too bad...
FJP
|
105.6 | Pointer to Disaster Manual? | USPS::FPRUSS | Frank Pruss, 202-232-7347 | Wed Apr 09 1997 18:18 | 16 |
| Re: .3 Disaster manual?
Is this the addendum that I have copied but not yet printed?
Sorry if my questions cover stuff already documented.
I have been spending my energy building up a UNIX 4.0B system with
adequate resources to model what we'd like to recommend, and haven't
had time to review this addendum. The customer is pressing us with
questions faster than we can get answers.
If there is a different "Disaster Recovery" manual than the Addendum,
please provide pointer.
FJP
|
105.7 | | BRSSWS::DEVOS | Manu Devos DEC/SI Brussels 856-7539 | Thu Apr 10 1997 08:49 | 11 |
| Frank,
>> If you are saying that we can now get the network up after booting
>> the UNIX CD in .4, then there is no problem!
| The procedure I gave in .4 works. I personally used it on V3.2D
and on V4.0B.
Happy to help you, ... from Brussels :-)
Manu.
|
105.8 | Memory Channel available? | USPS::FPRUSS | Frank Pruss, 202-232-7347 | Thu Apr 10 1997 21:57 | 3 |
| I don't suppose booting from CD supports memory channel as the network?
FJP
|
105.9 | Cute, but needs tweaking. | USPS::FPRUSS | Frank Pruss, 202-232-7347 | Thu Apr 10 1997 22:29 | 15 |
| I know we are straying a bit from NetWorker specifics here.
I have played with this and found that, to use vrestore, I need
to use dd to read the tape.
If I try to use vrestore -i -f tape to inspect the backup, I get a core
dump. If I use dd if=(tape) bs=60K | vrestore -i -f - to look at the
tape, it is fine. I suspect that vrestore and dd do not agree on the
meaning of 60k, or that because the initial vdump went to "-" instead
of tape, vdump did not block the data at 60K.
Specifically, vrestore -i -f (tape) complains that it only got "60k"
(61440) when it was looking for 64K (65536).
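The two numbers in that error message line up exactly with the two block
sizes in play, as plain arithmetic:

```shell
# The record sizes behind the vrestore complaint:
echo $((60 * 1024))   # bytes written per record with bs=60k
echo $((64 * 1024))   # bytes per record that vrestore expects
```

So the tape records really are 4 KiB short of what vrestore wants,
which points at the bs= value rather than at dd or the pipe.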
It's time for a nap. I'll check this tomorrow.
|
105.10 | We must use bs=64k | BACHUS::DEVOS | Manu Devos DEC/SI Brussels 856-7539 | Fri Apr 11 1997 04:52 | 14 |
| Hello Frank,
| You're right... I composed note .4 from memory and didn't recall the
exact block size to use, so I quickly checked the vdump(8) man page
and typed 60K. Today I made the test in a real situation, and it
appears that vrestore wants to read 64K blocks. So I changed to
bs=64K for both the vdump and the vrestore, and all is OK.
Anyway, I am confused, because I am quite sure that I had previously
used bs=60K! (Maybe it was on an older version; today I made the test
with V4.0B.)
Regards, Manu.
|
105.11 | | DECWET::RWALKER | Roger Walker - Media Changers | Fri Apr 11 1997 09:09 | 7 |
| Re: the last few
I'd like to thank you guys for these last few replies, they help
us understand alternatives for quick disaster recovery. Since
it takes a lot to get a system up enough to run a NetWorker
recovery we like to hear about every way to make this happen
faster for the customer. It really adds to the whole package.
|
105.12 | | DECWET::FARLEE | Insufficient Virtual um...er.... | Fri Apr 11 1997 10:23 | 22 |
| I've been mulling a few ideas around, and I'd like to get feedback
from you folks on how much demand you see:
1) NetWorker recovery from RIS boot...
You can set up a system as a RIS server. One of the options when
booting RIS is to choose the "system management" option on the menu.
If you put srecover (a statically linked version of recover) into
the directory where these functions live, you can use it to recover
all of your disks from a NetWorker server without ever installing
the OS.
This is not as simple a process as it sounds, and there are catches,
but it is possible. Would it be a desirable feature if we were to
document the setup, and write scripts which would help to automate
the tricky bits?
2) NetWorker recovery from bootable tape/cd
This would be a bootable tape or CD with a script that prompts
you for info (node name/address, server name/address, routing info,
etc.) and then runs recover based on that.
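A minimal sketch of how such a script might collect and apply that info.
Every name here (the tu0 interface, the prompts, the recover invocation)
is an assumption for illustration; no such Digital tool exists in the
note:

```shell
# Hypothetical sketch for the bootable-CD idea in .12.  It only
# GENERATES the commands so they can be reviewed before being run as
# root; the tu0 interface name and the recover flags are assumptions.
net_setup_cmds() {
    # $1=node name  $2=node IP  $3=netmask  $4=server name  $5=server IP
    printf 'hostname %s\n' "$1"
    printf 'ifconfig tu0 %s netmask %s\n' "$2" "$3"
    printf 'echo "%s %s" > /etc/hosts\n' "$5" "$4"
}
net_setup_cmds SYSTEM-B 16.1.1.2 255.255.255.0 SYSTEM-A 16.1.1.1
# A real script would then run something like: recover -s SYSTEM-A ...
```

Generating rather than executing the commands keeps the sketch safe to
test and lets an operator sanity-check the addresses first.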
Thoughts? Feedback?
|
105.13 | | KITCHE::schott | Eric R. Schott USG Product Management | Fri Apr 11 1997 12:39 | 10 |
| Does the procedure a few notes back work with LSM mirrored /, /usr,
/var?
I expect you have to change a few things after the restore to
turn off lsm long enough to re-encapsulate...
Also, appropriate command when init'ing disks to ensure
they boot...
|
105.14 | Time to get serious! | USPS::FPRUSS | Frank Pruss, 202-232-7347 | Fri Apr 11 1997 18:18 | 26 |
| LSM Note 643 has a lot of hints (and even details) on how to go about
recovering an LSM configuration.
I am asking these questions in preparation for getting my customer
(really a DIGITAL consultant working with the customer) ready to
develop a detailed disaster recovery plan for a UNIX TruCluster/ASE
supporting ORACLE OPS. I expect this to be taken to the level that the
system can be rebuilt from "new, un-used parts" at a different site.
Similar to the work Ron Ginger has been doing for his customer as
described in the LSM conference.
In 18 months, it is conceivable that this system will be hosting 0.5 to
1.5 Tb of ORACLE data, including all development, training and QA
instances. (Right now it is only capable of 100 to 180 Gb, depending
on whether they stay with LSM mirrors or move to controller RAID-5.)
I expect the development of the plan to be performed by DIGITAL NSIS,
and intend that it be delivered in the form of a WORD/PDF document,
including any scripts to save and restore configuration data.
Hopefully, when this is all done, it will be an "interesting" document
to share internally.
I don't have the equipment here in the home office to play with all
aspects of the technical requirements _yet_. But I will be scrounging
around!
|
105.15 | Reply to .13 and .12 | BACHUS::DEVOS | Manu Devos NSIS Brussels 856-7539 | Sat Apr 12 1997 10:24 | 41 |
| R: .13
Hi Eric,
| Yes, the procedure should be adapted for an LSM/ADVFS setup on the
system disk, but my proposal was aimed only at showing that it is NOT
necessary to re-install a whole UNIX from CD. We can use the network
from the standalone (SAS) UNIX with only the following three commands:
# hostname SYSTEM_NAME
# ifconfig net_device address netmask
# echo "server_address SERVER_NAME" > /etc/hosts
And then your externally saved data (tape, disk, NSR) are accessible.
--------------------------------------------------------------------------
R: .12
I think that the two proposals are good for us. The first for the big
sites and the DEC sites, and the second more specifically for the
ordinary customers.
But the most interesting development in this area would be an LSM
program which would "read" the LSM PRIVATE AREA of a rootdg disk and
save the info in a file. That file could then be interpreted by a
second LSM program at recovery time to AUTOMATICALLY re-create the
LSM private area on the new disk.
You will say that volsave/volrestore already exist, but they do NOT
work for the system disk, and the main problem of a crash recovery is
the system disk. So I am preaching for that.
LSM already provides programs like "volprivutil dumpconfig", so the
reverse should be possible. Any taker in the LSM group?
So, to summarize my point of view: again, I think that we need better
integration of NSR-LSM-ADVFS-ASE.
Regards, Manu.
|