RAC is not for an UNBREAKABLE database; it is for _improved availability and maintainability_. RAC has a high risk of whole-database failure in many different scenarios.
But it lets you plan maintenance without shutting the database down, protects you from single-server failures (in some scenarios ONLY), and so on.
If you need HA, use Data Guard or something else. A RAC database is not an HA database, simply by design: too much information is shared, and the clusterware is weak for now and does not guarantee detecting all deadlock situations.
In short, if we compare RAC vs. non-RAC:
- the probability of system failure is almost the same;
- the average non-access time per month is much better in RAC;
- the chance of a long failure is almost the same (even higher) in RAC;
- the chance of a short failure (system reboot, applying a patch, and so on) is much lower in RAC.
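To make the comparison above concrete, here is a rough sketch of the arithmetic in Python. All figures are hypothetical and chosen only to illustrate the shape of the argument; they are not measurements of any real system. The names `OutageClass` and `expected_downtime` are invented for this example. The idea is that expected non-access time per month is the sum of (frequency x duration) over outage types, so cutting the frequency of short outages lowers the monthly average even when the long-failure risk is unchanged.

```python
from dataclasses import dataclass


@dataclass
class OutageClass:
    events_per_month: float   # assumed frequency of this kind of outage
    minutes_per_event: float  # assumed time the database is unreachable


def expected_downtime(outages):
    """Expected minutes of non-access per month across all outage classes."""
    return sum(o.events_per_month * o.minutes_per_event for o in outages)


# Hypothetical numbers: first entry is short outages (reboot, patching),
# second entry is long outages (whole-database failure).
single_instance = [OutageClass(2.0, 15.0), OutageClass(0.02, 480.0)]
# Assumption: RAC makes short outages ten times rarer and leaves the
# long-failure risk as it was.
rac = [OutageClass(0.2, 15.0), OutageClass(0.02, 480.0)]

print("single instance:", expected_downtime(single_instance), "min/month")
print("RAC:            ", expected_downtime(rac), "min/month")
```

With these made-up inputs the monthly average drops for RAC, while the long-failure term is identical in both lists, which is the point: a better average does not mean the database is protected against the long outage.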
----- Original Message -----
From: "Shreehari Desikan" <shree@(protected)>
To: "Kevin Closson" <kevinc@(protected)>
Cc: <suse-oracle@(protected)>
Sent: Thursday, September 15, 2005 6:07 PM
Subject: Re: [suse-oracle] Shutting down one instance shuts down the whole RAC database.
> thanks a ton for this piece of info.
> it worked finally. I really appreciate this.
>
> hey, btw , what made you say that the stack is Unbreakable :-)
> Can't understand it, though.
>
> thanks
> Shree.
>
> Kevin Closson wrote:
>
> >applying that patch is only half of it. Have
> >you set filesystemio_options=directIO
> >on both instances ?
> >
> >BTW, it looks like your whole stack is Unbreakable :-)
> >
> >I think you should ask Oracle if OCMS on 9.2.0.6
> >does Direct IO. Probably doesn't. I know for a fact
> >that it doesn't in 10.1.0.4...strace proves that:
> >
> >[pid 6155] open("/u02/crsdata/ocr10g.dbf", O_RDONLY|O_SYNC|O_LARGEFILE) = 10
> >[pid 6155] open("/u02/crsdata/ocr10g.dbf", O_RDONLY|O_SYNC|O_LARGEFILE) = 10
> >[pid 6155] open("/u02/crsdata/ocr10g.dbf", O_RDONLY|O_SYNC|O_LARGEFILE) = 10
> >[pid 6155] open("/u02/crsdata/ocr10g.dbf", O_RDONLY|O_SYNC|O_LARGEFILE) = 10
> >[pid 6167]
> >O_RDONLY|O_SYNC|O_LARGEFILE) = 17
> >[pid 6169]
> >.
> >.
> >.
> >[pid 6155] open("/u02/crsdata/ocr10g.dbf", O_RDONLY|O_SYNC|O_LARGEFILE) = 10
> >[pid 6155] read(10, "ocrconfig_loc=/u02/crsdata/ocr10"...,
> >
> >I'm flippant about the fact that Oracle clusterware
> >doesn't do O_DIRECT opens on NAS, because I don't
> >care. If you use PolyServe-based NAS (such as the
> >HP EFS Clustered Gateway), Oracle clusterware
> >gets O_DIRECT whether it wants it or not it does
> >an O_DIRECT open(2) call. It is a mount option
> >in our NAS offering.
> >
> >[pid 6155] open("/u02/crsdata/ocr10g.dbf",O_RDONLY|O_SYNC|O_LARGEFILE) = 10
> >[pid 6155]
> >
> >>>>-----Original Message-----
> >>>>From: Bennett Leve [mailto:bennett.leve@(protected)]
> >>>>Sent: Friday, September 15, 2006 2:42 PM
> >>>>To: Shreehari Desikan
> >>>>Cc: suse-oracle@(protected)
> >>>>Subject: Re: [suse-oracle] Shutting down one instance shuts down the whole RAC database.
> >>>>
> >>>>Shreehari,
> >>>>
> >>>>What are your NFS mount options on the servers? This
> >>>>typically is caused by stale read.
> >>>>
> >>>>-Bennett
> >>>>
> >>>>Shreehari Desikan wrote:
> >>>>
> >>>>>Hi All
> >>>>>I am running SLES 9 SP2 Linux on a 2 node RAC cluster configuration,
> >>>>>with NFS mounted Netapp filer and Oracle 9.2.0.6 RAC and 9.2.0.6
> >>>>>Cluster Manager. The problem is when I shut down the instance on any
> >>>>>one of the nodes, the entire database comes to a halt. The node that I
> >>>>>was expecting to be running has the following messages in the
> >>>>>Alert_sid.log file
> >>>>>
> >>>>>Errors in file /var/opt/oracle/admin/.../bdump/gmdb2_smon_18637.trc:
> >>>>>ORA-00600: internal error code, arguments: [kclchkblk_3], [0], [733768], [14], [], [], [], []
> >>>>>Fatal internal error happened while SMON was doing instance transaction recovery.
> >>>>>Thu Sep 15 01:17:38 2005
> >>>>>Errors in file /var/opt/oracle/admin/.../bdump/gmdb2_smon_18637.trc:
> >>>>>ORA-00600: internal error code, arguments: [kclchkblk_3], [0], [733768], [14], [], [], [], []
> >>>>>SMON: terminating instance due to error 600
> >>>>>Thu Sep 15 01:17:38 2005
> >>>>>Trace dumping is performing id=[cdmp_20050915011738]
> >>>>>Thu Sep 15 01:17:38 2005
> >>>>>Dump system state for local instance only
> >>>>>Thu Sep 15 01:17:38 2005
> >>>>>Trace dumping is performing id=[cdmp_20050915011738]
> >>>>>Thu Sep 15 01:17:43 2005
> >>>>>Instance terminated by SMON, pid = 18637
> >>>>>
> >>>>>Oracle support suggested a patch "" 2448994 "DIRECT IO SUPPORT OVER
> >>>>>NFS". ""...however that still did not fix the issue.
> >>>>>
> >>>>>Has anyone come across this situation and fixed it?? Any help is
> >>>>>definitely much appreciated.
> >>>>>
> >>>>>Regards,
> >>>>>Shree.
> >>>>>
> >>>>--
> >>>>To unsubscribe, email: suse-oracle-unsubscribe@(protected)
> >>>>For additional commands, email: suse-oracle-help@(protected)
> >>>>Please see http://www.suse.com/oracle/ before posting