Pages: 1 2 3 [4] 5 :: one page |
|
Author |
Thread Statistics | Show CCP posts - 6 post(s) |
Yur mom
|
Posted - 2008.10.30 03:44:00 -
[91]
Originally by: Triffon
Originally by: AdioHyperion
Originally by: Generic Alt
Originally by: Ghost Goat guess the old "lets reboot the cluster and hope it works" didnt work
i was soooo close to change a skill ,give us the damn que allready .
I was literally 5 mins away from skill train completion when I got booted.
I agree... give us the skill queue, please!
+1
+1
+.86 (stacking penalty)
|
|
CCP Valar
|
Posted - 2008.10.30 03:45:00 -
[92]
This is the real deal. Server will be up in a few minutes, possibly a bit laggy while the DB server cache is rebuilt.
I'll post a little post-mortem here in a few. ---- Virtual World Database Administrator Operations department CCP Games |
|
Nacho Muchos
|
Posted - 2008.10.30 03:45:00 -
[93]
Originally by: AdioHyperion
Originally by: Generic Alt
Originally by: Ghost Goat guess the old "lets reboot the cluster and hope it works" didnt work
i was soooo close to change a skill ,give us the damn que allready .
I was literally 5 mins away from skill train completion when I got booted.
I agree... give us the skill queue, please!
+1
I was 15 minutes from finishing a skill, and I was warping into a mission when it went down. I sure hope my ship is still there when I login, and yes we definitely need some sort of skill queue for things like this, even if it is only one skill in advance. I was just about to give up and go to sleep but that would cost me another 6 or so hours of skill training, which is pretty crucial when you are training a new character. >.>
|
Ghost Goat
|
Posted - 2008.10.30 03:45:00 -
[94]
Originally by: Generic Alt
Originally by: Ghost Goat guess the old "lets reboot the cluster and hope it works" didnt work
i was soooo close to change a skill ,give us the damn que allready .
I was literally 5 mins away from skill train completion when I got booted.
I agree... give us the skill queue, please!
5 mins ! man i was hoovering over the skill i wanted to switch to,clicked it and was about to go to sleep when i noticed it didnt changed and a sec later i got disconnected .
|
AdioHyperion
Caldari One Stop Mining Shop
|
Posted - 2008.10.30 03:45:00 -
[95]
Originally by: CCP Valar I'll post a little post-mortem here in a few.
Do what?
|
Nik W
|
Posted - 2008.10.30 03:45:00 -
[96]
Originally by: Triffon
Originally by: AdioHyperion
Originally by: Generic Alt
Originally by: Ghost Goat guess the old "lets reboot the cluster and hope it works" didnt work
i was soooo close to change a skill ,give us the damn que allready .
I was literally 5 mins away from skill train completion when I got booted.
I agree... give us the skill queue, please!
+1
+1
+1
|
commander tennder
|
Posted - 2008.10.30 03:45:00 -
[97]
5 sec
|
Zach Forrester
|
Posted - 2008.10.30 03:46:00 -
[98]
Originally by: CCP Valar This is the real deal. Server will be up in a few minutes, possibly a bit laggy while the DB server cache is rebuilt.
I'll post a little post-mortem here in a few.
Again, I apologise if I had anything to do with this. XD I swear I'm paranoid now. A hundred angry mercenaries will come to pod me, I'm sure of it. *panicpanicpanic*
|
Otho Underhill
|
Posted - 2008.10.30 03:46:00 -
[99]
In
|
Super Skulls
Twin Sun Syndicate
|
Posted - 2008.10.30 03:46:00 -
[100]
zomg it worx!!!!
|
|
Tukaa
Amarr Burning Sky Labs
|
Posted - 2008.10.30 03:46:00 -
[101]
ONLINE
|
Mankirks Wife
|
Posted - 2008.10.30 03:46:00 -
[102]
Originally by: Yur mom
+.86 (stacking penalty)
Due to balance issues, we have seen fit to nerf pyramid quotes. ---
|
AdioHyperion
Caldari One Stop Mining Shop
|
Posted - 2008.10.30 03:47:00 -
[103]
So far so good....
|
Cygnus DivumExuro
Gallente
|
Posted - 2008.10.30 03:47:00 -
[104]
bet folks are not thinking those IBM Blade Servers are not as cool as first thought!
|
Taius Pax
|
Posted - 2008.10.30 03:47:00 -
[105]
woot! it's back!
|
sp009
Caldari Organization Outcast
|
Posted - 2008.10.30 03:47:00 -
[106]
The server's up, everyone log in together and see if it is really fixed
|
Taius Pax
|
Posted - 2008.10.30 03:48:00 -
[107]
Originally by: Cygnus DivumExuro bet folks are not thinking those IBM Blade Servers are not as cool as first thought!
was that double negative intentional?
|
Noriko Sakai
Gallente DC1 Coalition
|
Posted - 2008.10.30 03:48:00 -
[108]
Server up ;-)
|
Saardinen
|
Posted - 2008.10.30 03:49:00 -
[109]
I'm just waiting for the postmortem. *nailbites*
|
Dinsdale Pirannha
Gallente
|
Posted - 2008.10.30 03:51:00 -
[110]
I have worked for a large company (think 3 letters) and we hosted many, many customers' web sites. Very rarely, we would have outages. All hell would break loose, even when we did not own the software/hardware/network.
In the case of the problem being traced back to something my company did, somebody would get fired. It is simply incomprehensible to me that ANY business would mess with their active database during regular business hours (that is the 23 hours/day in this case). And trust me, the only way CCP had a failure was if something WAS CHANGED.
You want to make changes that can't be done in the one hour you have allocated every day, you SCHEDULE A CHANGE WINDOW. You give everyone at least a week's heads up, and you bring down the server at a time which impacts the fewest people.
But these ad hoc outages are ridiculous, and would never happen if CCP actually followed some kind of discipline that every other company follows where its customers need to access information online.
Can you imagine what would happen if your bank decided to upgrade their database at 2:00 pm on a Wednesday? BTW, it is standard practice for banks, and most large firms, to schedule network outages from 12:01 am to 4:00 am on Sundays. That of course would not work for CCP, given the nature of their business, but there are times where impact would be minimized on their customer base.
I have been playing Eve for 6 months and am stunned at the lack of professionalism of their technical staff. The CIO should be fired.
|
|
Elaron
Jericho Fraction The Star Fraction
|
Posted - 2008.10.30 03:56:00 -
[111]
Originally by: Dinsdale Pirannha <snipped a rant>
How about you wait for the post mortem before you get on your high horse? By past experience Valar & co are pretty good at explaining the causes of these unplanned outages.
|
|
CCP Mitnal
C C P
|
Posted - 2008.10.30 03:57:00 -
[112]
Originally by: AdioHyperion
Originally by: CCP Valar I'll post a little post-mortem here in a few.
Do what?
Valar will explain what happened
I believe it will state that the servers were not at fault for those knocking them.
Mitnal Community Representative CCP Games, EVE Online Email / Netfang |
|
Xaniff
|
Posted - 2008.10.30 03:58:00 -
[113]
Originally by: Dinsdale Pirannha
In the case of the problem being traced back to something my company did, somebody would get fired. It is simply incomprehensible to me that ANY business would mess with their active database during regular business hours (that is the 23 hours/day in this case). And trust me, the only way CCP had a failure was if something WAS CHANGED.
You want to make changes that can't be done in the one hour you have allocated every day, you SCHEDULE A CHANGE WINDOW. You give everyone at least a week's heads up, and you bring down the server at a time which impacts the fewest people.
But these ad hoc outages are ridiculous, and would never happen if CCP actually followed some kind of discipline that every other company follows where its customers need to access information online.
I have been playing Eve for 6 months and am stunned at the lack of professionalism of their technical staff. The CIO should be fired.
I think the problem has been more of a hardware issue than software. And there was that problem with a local ISP yesterday which may have contributed.
|
Zach Forrester
|
Posted - 2008.10.30 03:59:00 -
[114]
Originally by: CCP Mitnal
Originally by: AdioHyperion
Originally by: CCP Valar I'll post a little post-mortem here in a few.
Do what?
Valar will explain what happened
I believe it will state that the servers were not at fault for those knocking them.
XD I'm gonna die, I know it. I honestly didn't mean it, whatever I did.
|
Ghost Goat
|
Posted - 2008.10.30 03:59:00 -
[115]
Originally by: Dinsdale Pirannha I have worked for a large company (think 3 letters) and we hosted many, many customers' web sites. Very rarely, we would have outages. All hell would break loose, even when we did not own the software/hardware/network.
In the case of the problem being traced back to something my company did, somebody would get fired. It is simply incomprehensible to me that ANY business would mess with their active database during regular business hours (that is the 23 hours/day in this case). And trust me, the only way CCP had a failure was if something WAS CHANGED.
You want to make changes that can't be done in the one hour you have allocated every day, you SCHEDULE A CHANGE WINDOW. You give everyone at least a week's heads up, and you bring down the server at a time which impacts the fewest people.
But these ad hoc outages are ridiculous, and would never happen if CCP actually followed some kind of discipline that every other company follows where its customers need to access information online.
Can you imagine what would happen if your bank decided to upgrade their database at 2:00 pm on a Wednesday? BTW, it is standard practice for banks, and most large firms, to schedule network outages from 12:01 am to 4:00 am on Sundays. That of course would not work for CCP, given the nature of their business, but there are times where impact would be minimized on their customer base.
I have been playing Eve for 6 months and am stunned at the lack of professionalism of their technical staff. The CIO should be fired.
well you can say that , or you can say wooooooooooooooo omgomgogmg the server is up !11!11!1111 /me do the happy dance ,after realizing what i just did go hide in the corner in shame .
skill changed i can go to sleep a happy man now ...
but we still need the damn skill que .
ghost training is a goner , no excuses now .
|
|
CCP Valar
|
Posted - 2008.10.30 04:01:00 -
[116]
With my outfit still smoking a bit, fresh from some firefighting, I bring you... THE POST MORTEM
The server crash tonight was a result of our attempts to prevent a server crash earlier today. Around 21:20 this evening, we had an automatic alert go off, warning us that a RAMSAN was critically low on disk space. In an attempt to fix this, we shrank the data file on it and started an index defrag, to free up space in the datafile.... crisis averted... or so we thought.
At 2 AM, a full backup of the database started, but the index defragmentation of the biggest, most critical table in the database was still underway. While a full backup of the database is being performed, the transaction log is not truncated on transaction log backups, and with the increased activity that comes with the index defrag, the transaction log quickly grew to fill up both RAMSANs... This is what led to the server crash.
When I got a phone call from the on-call person, the server was already down. I proceeded to do a fail-over of the database server, shrink the transaction log files and the datafiles on the RAMSANs. When I had done this, I attempted to start the server, but it took 3 startup attempts before nodes started registering themselves in the database on time, likely due to the "warming up" the database has to do after a failover.
I'm truly sorry for the inconvenience this caused you and hope you can enjoy playing for the rest of the night. ---- Virtual World Database Administrator Operations department CCP Games |
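The failure mode in the post-mortem above can be illustrated with a toy model: while a full backup is running, log backups do not truncate the transaction log, so a write-heavy job such as an index defrag keeps growing it until the volume is full. The sketch below is only an illustration of that reasoning. The function name, the capacity, the write rate and the 15-minute log backup interval are all made-up assumptions, not figures from CCP, and real database engines track log reuse in far more detail.

```python
from typing import Optional


def minutes_until_log_full(
    capacity_gb: float,
    write_rate_gb_per_min: float,
    duration_min: int,
    full_backup_running: bool,
    log_backup_every_min: int = 15,  # hypothetical log backup schedule
) -> Optional[int]:
    """Return the minute at which the log volume fills, or None if it never does.

    Toy model: a log backup empties the log unless a full backup is in
    progress, in which case the log is left untouched and keeps growing.
    """
    used_gb = 0.0
    for minute in range(1, duration_min + 1):
        # The index defrag (or any heavy write load) appends to the log.
        used_gb += write_rate_gb_per_min
        if used_gb >= capacity_gb:
            return minute
        # A scheduled log backup only frees space when no full backup runs.
        if minute % log_backup_every_min == 0 and not full_backup_running:
            used_gb = 0.0
    return None


# Hypothetical numbers: 100 GB of log space, 2 GB/min of defrag log traffic.
normal_night = minutes_until_log_full(100.0, 2.0, 120, full_backup_running=False)
backup_night = minutes_until_log_full(100.0, 2.0, 120, full_backup_running=True)
print(normal_night, backup_night)
```

With these assumed numbers the log never fills on a normal night, because each log backup frees the space again, but it runs out of room partway through the full backup. That is the same shape of problem described above: two operations that are each safe alone become unsafe when they overlap.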
|
Pr1ncess Alia
Caldari Perkone
|
Posted - 2008.10.30 04:02:00 -
[117]
(scene: front of an office building. press conference setting.)
front door opens and late middle aged man dressed in a business suit walks out
CEO: *ahem* I just crapped my pants.
crowd mumbles and whispers. a few scratch their heads.
CEO: That is all for now, thank you. We won't be taking questions at this time. Feel free to talk amongst yourselves.
flash bulbs go off. Business man walks back into the office building.
i kid i kid! Thanks for gettin er back up and running
|
Zach Forrester
|
Posted - 2008.10.30 04:03:00 -
[118]
Originally by: CCP Valar With my outfit still smoking a bit, fresh from some firefighting, I bring you... THE POST MORTEM
The server crash tonight was a result of our attempts to prevent a server crash earlier today. Around 21:20 this evening, we had an automatic alert go off, warning us that a RAMSAN was critically low on disk space. In an attempt to fix this, we shrank the data file on it and started an index defrag, to free up space in the datafile.... crisis averted... or so we thought.
At 2 AM, a full backup of the database started, but the index defragmentation of the biggest, most critical table in the database was still underway. While a full backup of the database is being performed, the transaction log is not truncated on transaction log backups, and with the increased activity that comes with the index defrag, the transaction log quickly grew to fill up both RAMSANs... This is what led to the server crash.
When I got a phone call from the on-call person, the server was already down. I proceeded to do a fail-over of the database server, shrink the transaction log files and the datafiles on the RAMSANs. When I had done this, I attempted to start the server, but it took 3 startup attempts before nodes started registering themselves in the database on time, likely due to the "warming up" the database has to do after a failover.
I'm truly sorry for the inconvenience this caused you and hope you can enjoy playing for the rest of the night.
Okay, so it wasn't my fault. ^^ *walks away a free man* XD I swear I thought I was gonna be hunted down there.
|
Syberbolt8
Gallente The Scope
|
Posted - 2008.10.30 04:05:00 -
[119]
Originally by: Zach Forrester
Originally by: CCP Valar With my outfit still smoking a bit, fresh from some firefighting, I bring you... THE POST MORTEM
The server crash tonight was a result of our attempts to prevent a server crash earlier today. Around 21:20 this evening, we had an automatic alert go off, warning us that a RAMSAN was critically low on disk space. In an attempt to fix this, we shrank the data file on it and started an index defrag, to free up space in the datafile.... crisis averted... or so we thought.
At 2 AM, a full backup of the database started, but the index defragmentation of the biggest, most critical table in the database was still underway. While a full backup of the database is being performed, the transaction log is not truncated on transaction log backups, and with the increased activity that comes with the index defrag, the transaction log quickly grew to fill up both RAMSANs... This is what led to the server crash.
When I got a phone call from the on-call person, the server was already down. I proceeded to do a fail-over of the database server, shrink the transaction log files and the datafiles on the RAMSANs. When I had done this, I attempted to start the server, but it took 3 startup attempts before nodes started registering themselves in the database on time, likely due to the "warming up" the database has to do after a failover.
I'm truly sorry for the inconvenience this caused you and hope you can enjoy playing for the rest of the night.
Okay, so it wasn't my fault. ^^ *walks away a free man* XD I swear I thought I was gonna be hunted down there.
We can hunt you down anyway if you like... shouldn't be much of an issue. Im Looking for a new Home |
Stuart Bruegel
Gallente
|
Posted - 2008.10.30 04:09:00 -
[120]
Originally by: CCP Valar With my outfit still smoking a bit, fresh from some firefighting, I bring you... THE POST MORTEM
<snip>
Gotta love backups. I swear they break things far more often than you have to use them.
Thanks for the detailed post-mortem! Those of us in the business appreciate it.
Go get some sleep. :-)
|
|