Discussion in 'Bukkit Help' started by Kane, Feb 9, 2011.

    OS: CentOS 5 2.6.18-194.32.1.el5 x86_64 x86_64 x86_64 GNU/Linux
    Java: JDK 7 X64 build b128
    Bukkit: git-Bukkit-0.0.0-379-g5e434da-b293 (MC: 1.2_01)
    Server Specs: i7 860 Quad Core with 16 gigs of memory.
    Launch: "/usr/lib/jan20/bin/java" -d64 -Xincgc -Xmx12G -jar craftbukkit-0.0.1-SNAPSHOT.jar nogui

    Comment: So I seem to have an unstoppable memory leak in Bukkit? To give you an idea of how powerful my machine is: on 1.1 I had both farm animals and monsters on, over 90 players online, and PvP enabled. I was able to keep the server running for 24 hours before I shut it down, since it was a public test server. It always had 50+ players and hit a peak of 90 with the same specs, same launch, same everything. Except I had allocated only 6GB to it...

    I also run everything on a ramdrive, so I/O is not an issue.

    2011-02-09 16:44:48 [INFO] [HEROCHAT] [g] GTA4rox08: Agentkid, I can get you mossy.
    2011-02-09 16:45:02 [INFO] [HEROCHAT] [g] confuzzledyma: gone for a while, see you guys later
    2011-02-09 16:45:02 [INFO] [HEROCHAT] [g] GTA4rox08: How much are you paying?
    2011-02-09 16:45:07 [INFO] [HEROCHAT] [g] KillerSammons: Sounas how do u wightlist on vint
    2011-02-09 16:45:16 [INFO] [HEROCHAT] [g] cXhristian: They are completly different. Only the idea is the same
    2011-02-09 16:45:16 [INFO] [HEROCHAT] [g] streetshine: HOW MUCH MOSSY U WANT?
    2011-02-09 16:45:21 [INFO] [HEROCHAT] [g] confuzzledyma: anyone need to buy a trade plot before I go?
    Whitelist: Player bamv9 is trying to join...allow!
    2011-02-09 16:45:33 [INFO] bamv9 [/xxxxxxxxxx:50942] logged in with entity id 54767
    Player count: 35
    2011-02-09 16:45:33 [INFO] pm0303 [/xxxxxxxxxxxxx.62:56274] lost connection
    2011-02-09 16:45:36 [INFO] powerj2 lost connection: disconnect.quitting
    2011-02-09 16:45:39 [INFO] Freed 0.01094818115234375 MB.
    2011-02-09 16:45:43 [INFO] [HEROCHAT] [g] streetshine: sorry caps Connection reset
            at net.minecraft.server.Packet.b(SourceFile:102)
            at net.minecraft.server.NetworkManager.f(SourceFile:157)
            at net.minecraft.server.NetworkManager.c(SourceFile:15)
    Exception in thread "Timer-1" Exception in thread "Connection #260 read thread" Exception in thread "Connection #232 read thread" Exception in thread "Connection #273 read thread" Exception in thread "Connection #29 read thread" Exception in thread "Connection #173 read thread" Exception in thread "Connection #266 read thread" Exception in thread "Connection #146 read thread" java.lang.OutOfMemoryError: Java heap space
    Exception in thread "Connection #207 read thread" Exception in thread "Connection #265 read thread" Exception in thread "Connection #255 read thread" java.lang.OutOfMemoryError: Java heap space
    java.lang.OutOfMemoryError: Java heap space
    2011-02-09 16:48:50 [INFO] Stopping server
    java.lang.OutOfMemoryError: Java heap space
    Noon (v1.2 by Feverdream) is off.
    Exception in thread "Connection #257 read thread" Exception in thread "Connection #34 read thread" Exception in thread "Connection #181 read thread" java.lang.OutOfMemoryError: Java heap space
    java.lang.OutOfMemoryError: Java heap space
    java.lang.OutOfMemoryError: Java heap space
    Goodbye world!
    HeroChat version 2.67 disabled.
    2011-02-09 16:50:47 [INFO] WorldEdit: Permissions plugin detected! Using Permissions plugin for permissions.
    2011-02-09 16:50:47 [INFO] WorldGuard: Permissions plugin detected! Using Permissions plugin for permissions.
    SignLift version 0.3 is disabled :(
    Goodbye world!
    2011-02-09 16:50:52 [INFO] Saving chunks
    [godcraft@196090 ramdisk]$ Goodbye world!
    bash: Goodbye: command not found
    [godcraft@196090 ramdisk]$ Goodbye world!
    Don't know whether this will help, but first, go back to Java 6 from the Java 7 beta. Second, is this your machine or a VPS? If it's your machine, switch to Ubuntu Server 10.10, which is way more stable than CentOS, imo. There have been reported issues with CentOS and Minecraft on other boards.
    Have you eliminated your plugins and tried without them? One of them is likely to be your problem.
    CentOS was never an issue when I had 90+ players, and having to test each plugin is a joke and not practical. They really need to do a better job of detecting such bad plugins.
    This is bleeding edge software. I'm not sure what you expect here.
    Well, it's just that it requires 30+ online just to find the bug :(
    There IS a memory leak in the software. It's even on Get Satisfaction.

    And for the record, blaming any one Linux distribution is sort of a cop-out if they all run the official Java. I have never had issues on my CentOS server, except when bugs existed in Notch's code.
    I know it's a lot of work. If you don't want to do it, you're going to get a bunch of replies just like mine (aka, check your plugins) or a bunch of replies saying "I don't have those problems." You're looking for a needle in a haystack, and all we can really do is advise of a way to search for that needle in that haystack.
    --- merged: Feb 10, 2011 4:49 AM ---
    True, but setting his max heap size at 12GB should allow him to run it for a very long time before that memory leak causes him to stop like that.
    The main problem is that Java developers usually don't have the strong foundation in memory management that C/C++ developers do. They usually end up thinking that such leaks CAN'T happen, and then when one does they act like "what? not my code!". Yet heck, even the best programmer gets them once in a while.

    Makes me wonder if I should start a contest or something to find memory leaks..
    I wasn't blaming CentOS. I read a post somewhere on a forum that some VPSes and distros of CentOS have modified code, and that code conflicts with Minecraft in some way. I don't run CentOS, so I didn't really pay attention to the whole problem. I was advising the OP to go to Java 6 instead of 7 and try running Ubuntu 10.10. If he doesn't want to, that's up to him and his memory leak. And the memory leak should be caught by using the -Xincgc modifier in your start line. If that doesn't recycle your memory, then there is probably something else wrong with the configuration of the OP's system.
    Here is what I use for my CentOS-based system. No support is provided; my best advice is to read the docs and understand what each flag does. This is what I use on my server, and it seems to handle the memory leak in Minecraft better than without it. It should also be noted that this is configured for my system; your system may do better with a different config.

    java -server -Xmx1024M -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+CMSIncrementalPacing -Xms1024M -XX:ParallelGCThreads=2 -XX:+AggressiveOpts -jar craftbukkit-0.0.1-SNAPSHOT.jar nogui 2>&1
    Grum Bukkit Team Member

    Not reported on redmine, not using Java 1.5 or 1.6, not willing to test without certain plugins.

    What do you expect Bukkit to do about this? Our magic ball is currently under maintenance.
    @Grum, I'm willing to test on 1.5 or 1.6 and willing to test certain plugins. But most of my players are quitting. They wanted 1.2 and it was too buggy, and it seems that I'm the only one who can't run 30 players anymore.

    I will run a test server and try them one by one. The issue now is that I don't have enough players to test with; they're all quitting, let alone wanting to play on a test server.

    My other issue is: how do I know how much memory a plugin is using, or better yet, all of Java? I pull up top and htop, but I can't tell if it's using more than it's supposed to. It always looks right, yet my server is telling me otherwise.
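
    On the question of measuring Java's memory: top and htop show the memory of the whole process, not how full the Java heap is. A minimal sketch of reading heap figures from inside the JVM with the standard Runtime API follows. The class and method names (HeapReport, usedBytes, summary) are hypothetical and not part of Bukkit, and this reports the heap as a whole, not per plugin.

```java
// Minimal sketch: report JVM heap usage from inside the server process.
// HeapReport and its method names are hypothetical, not part of Bukkit.
final class HeapReport {
    private static final long MB = 1024L * 1024L;

    /** Heap bytes currently occupied (committed heap minus free space in it). */
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** One-line summary: used, committed and maximum heap, in MB. */
    static String summary() {
        Runtime rt = Runtime.getRuntime();
        return String.format("heap used=%d MB, committed=%d MB, max=%d MB",
                usedBytes() / MB, rt.totalMemory() / MB, rt.maxMemory() / MB);
    }

    public static void main(String[] args) {
        System.out.println("before gc: " + summary());
        System.gc(); // only a request; the JVM is free to ignore it
        System.out.println("after gc:  " + summary());
    }
}
```

    If the "used" figure keeps climbing across garbage collections while the player count stays flat, something is probably holding on to objects it no longer needs.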

    I really would do anything, but I think I need some guidance. I'm not as experienced at bug finding as I was at running a smooth server in the past.

    Hell, it could be the Noon plugin for all I know, but this crash happens hours later. This was with 30+ on and everyone excited to play with 1.2. Now I doubt I can get that many players together again.
    --- merged: Feb 10, 2011 8:41 AM ---
    I have now created a Leaky bug report. I'm sorry if I came off like someone who does not give a shit or is not willing to try. I do try; I spend every day in IRC testing new things. I have no players left, and this was a big event today that turned into a disaster. The remainder of my players begged to go back to 1.1, and so we did.

    Now I will be only able to test with a few people if that.
    I would appreciate it if you didn't make random, unproven, slanderous comments about my plugin. I tried to help you by posting what I use, and you start acting like that?
    Wow, you totally took that out of context lmao. I said it could even be the Noon plugin. I said that as a sarcastic joke, since it's a simple timer that I would assume just resets the position every few minutes.

    It was not meant to harm in any way...

    On a more serious note, I've been looking at this:
    Sarcasm does not work with me. I'm very literal.

    Setting the time literally only updates an integer value. There is no memory leak in my code.
    While, yes, I doubt your code is causing memory leaks, I had to post both quotes side by side as I laughed quite good at that.
    I wrote a plugin that could get you more info.

    I haven't tested it other than running it twice. It has 1 parameter set in the chunkcheck.txt file that sets the period (in server ticks).

    Every 10 mins (default), it will print out the number of loaded chunks and the number in use.

    It considers any chunk within 176 blocks of a player as "in-use". It then gives you the percentage of chunks that are in use.

    Also, I haven't accounted for the spawn area, so that is 625 chunks that are always loaded, but don't count as in-use.

    In theory, it could be set to reap the not-in-use chunks, it is only a monitor at the moment.
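
    The loaded versus in-use count described above can be sketched roughly as follows. This is not the plugin's actual code and does not use the Bukkit API. The class and method names are hypothetical, and the distance rule is an assumption: 176 blocks is treated as 11 chunks (176 / 16) measured per axis from the player's chunk. The always-loaded spawn area is not modelled.

```java
import java.util.List;

// Sketch of a "loaded vs. in-use chunks" count. All names are hypothetical.
final class ChunkUsage {
    static final int CHUNK_SIZE = 16;            // blocks per chunk edge
    static final int IN_USE_RADIUS_BLOCKS = 176; // figure quoted in the post

    /** Chunk coordinate containing a block coordinate (floors negative values). */
    static int toChunk(int block) {
        return Math.floorDiv(block, CHUNK_SIZE);
    }

    /**
     * Assumption: a chunk is "in use" if it is within 11 chunks of some
     * player's chunk on both the X and Z axes. Players are {blockX, blockZ}.
     */
    static boolean inUse(int chunkX, int chunkZ, List<int[]> playerBlocks) {
        int radius = IN_USE_RADIUS_BLOCKS / CHUNK_SIZE;
        for (int[] p : playerBlocks) {
            if (Math.abs(chunkX - toChunk(p[0])) <= radius
                    && Math.abs(chunkZ - toChunk(p[1])) <= radius) {
                return true;
            }
        }
        return false;
    }

    /** Number of loaded chunks ({chunkX, chunkZ} pairs) that are in use. */
    static int countInUse(List<int[]> loadedChunks, List<int[]> playerBlocks) {
        int n = 0;
        for (int[] c : loadedChunks) {
            if (inUse(c[0], c[1], playerBlocks)) n++;
        }
        return n;
    }

    /** Percentage of loaded chunks that are in use; 0 when nothing is loaded. */
    static double percentInUse(List<int[]> loadedChunks, List<int[]> playerBlocks) {
        if (loadedChunks.isEmpty()) return 0.0;
        return 100.0 * countInUse(loadedChunks, playerBlocks) / loadedChunks.size();
    }
}
```

    With a single player standing at block (5, 5), chunk (11, 0) counts as in use under this rule and chunk (12, 0) does not. A loaded count that keeps growing while the in-use count stays flat is the pattern that points at chunks not being unloaded.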
    Thanks @Raphfrk. It did not help... well, actually, yes it did. It proved that the list that holds which chunks are loaded seems to be desyncing and not keeping up.

    So it helps to know that things looked normal but clearly aren't. One of the things you notice: after a long walk, running the stop command (which saves the world) should take 2-3 seconds at most on an average good PC with 1 person online. Now, with 30, it could take 1-2 minutes. And this is what it acts like after exploring a lot.

    So who knows. @Tahg is working his ass off but is not sure how to squish the bug yet.
    I'm glad you find my humor enjoyable.
    @Dinnerbone @Grum I hope you can check this out when you get a minute.

    Okay, here is an update. It seems that the leak is not plugged yet in 303.

    I'm now trying flying at 10x into a brand new world (same world seed) and letting each one generate while auto-flying and bouncing off the edge of the world while it tries to load.

    Doing this did show more load over time with Vanilla, but once you exit the client it slowly drops over time. This was NOT the case with Bukkit. It dropped very little and stayed that way.

    With Bukkit, in the end it flatlines and then you see a small bump? I wanted to see if logging in another client (a different user) would cause any special GC. This was not the case at all, and it was even worse, since when that client exited its memory footprint stayed too.

    There is for sure still a massive issue here, and unfortunately I don't see much progress yet. They have been trying their hardest to fix it all day long, but I just want to provide the 303 test and show you what happened.

    As you can see, once I stopped moving and exited, the Vanilla one went all the way back down to 85MB of RAM, whereas Bukkit only went down to 912MB. I waited about 30 minutes after the Bukkit picture, hoping there was some slow process or GC, but no luck.

    I really hope someone figures out a fix, because @Tahg and @Zenexer worked their asses off for no reward.

    Also, if anyone would like to read the updated source code that broke Bukkit in the first place, check out Bukkit 266. Here it is on GitHub:
    Was the number of chunks increasing for you as you moved around?

    When I walked around, the number of chunks increased (unloading didn't seem to be happening), but I wasn't using a fly plugin to move, so didn't move very far.

    If the list isn't increasing, then it can't be used to reap loaded chunks.

    Btw, the key number is not in-use chunks, it is loaded chunks. The in-use chunks should be pretty constant, even if the loaded chunks increases.

    That is pretty weird. It suggests that the "save" list of loaded chunks does include all loaded chunks.
    When you ran vanilla, did you run it with the exact same Java flags? If not, try craftbukkit with just the basic flags (-Xms1024M -Xmx1024M) and see if it's any different. Try avoiding -d64 and -Xincgc (just to rule them out as a potential problem).

    Btw, that's some awesome troubleshooting you're doing.
    Keep issues like these on Leaky, so we have one place to look for them please.
    I have been putting everything I can there, but lots of people offer good suggestions on here.

    I actually don't use any flags when testing, just the default 1.5 max heap size in Windows. I will be testing Java 6, though, since I have been using Java 7, although that works fine for Vanilla and for 265 and before.
    Hmm, I will play around more, though I'm getting tired of it and the devs seem to be burned out on the issue. At this point it seems to be only me saying there is an issue.
    You aren't the only one. I did an update yesterday of CraftBukkit and all plugins, and now I am getting Java out-of-memory errors and server crashes. I'm not saying it's Bukkit's fault. I will do my own testing throughout the day by disabling plugins one by one.
    Been getting this a lot lately too, usually after we go on a massive minecart journey and load up a lot of chunks. I'm only using 512 ram but it's been fine up until now. Started picking away at plugins, will report back if I figure anything out.
    I would like to thank everyone for helping fix this bug. I don't know if we would call this a memory leak now or if it's considered something else, though the issue turned out to be simply a really bad error.

    What was the error? Chunks did not unload, period. I thought it was something worse, and as far as I know they found a few small bugs along the way, plus they fixed it so chunks unload properly now. So everything is successful!!!!
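
    Whether or not it counts as a "leak", chunks that never unload exhaust the heap for a simple reason: the garbage collector cannot free anything that is still referenced from a live collection. The toy model below illustrates that. It is not CraftBukkit code. The class name, the key packing, and the 16 x 16 x 128 byte chunk payload are illustrative assumptions.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Toy model of a chunk cache. Hypothetical names; not CraftBukkit code.
final class ChunkCache {
    private final Map<Long, byte[]> loaded = new HashMap<>();

    /** Packs two chunk coordinates into one map key. */
    private static long key(int cx, int cz) {
        return ((long) cx << 32) | (cz & 0xFFFFFFFFL);
    }

    /** Loads a chunk if absent. While it stays in the map, GC cannot reclaim it. */
    void load(int cx, int cz) {
        loaded.computeIfAbsent(key(cx, cz), k -> new byte[16 * 16 * 128]);
    }

    /**
     * Drops every chunk more than `radius` chunks from (cx, cz) on either axis.
     * Without a step like this the map only ever grows. Returns chunks dropped.
     */
    int unloadFarFrom(int cx, int cz, int radius) {
        int removed = 0;
        Iterator<Long> it = loaded.keySet().iterator();
        while (it.hasNext()) {
            long k = it.next();
            int x = (int) (k >> 32); // high 32 bits hold the X coordinate
            int z = (int) k;         // low 32 bits hold the Z coordinate
            if (Math.abs(x - cx) > radius || Math.abs(z - cz) > radius) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }

    int size() {
        return loaded.size();
    }
}
```

    A player travelling in a straight line calls load for every new chunk. If unloadFarFrom (or an equivalent) is never called, every chunk ever visited stays on the heap until the JVM runs out of memory, no matter which GC flags are set.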

    Here is an example of what happened with 1 person going down the subway system for about 4 minutes:


    Now I want to share the evidence of it being fixed, though there is another error, unrelated to chunk unloading, that happens when chunks are created for the first time. I will show more of this later on.

    So here we go. I want to show you what happens if you go 8000 blocks north from spawn on a minecart, then warp back to spawn, then go 8000 blocks south in a minecart:


    See, it looks like it's working better than expected now :)

    Now this is what happens if you speed hack at 10x on pre-generated terrain:


    I did this for about 25 minutes, and it clearly shows that it's working wonderfully.

    Now let's try fresh terrain generation. At this point I have only tested 1x movement, but I think it shows it's pretty stable:


    Looks good, and I did this for 45 minutes!

    So here is the only issue we have now. Vanilla also has this issue, though there a forced GC fixes it, whereas on Bukkit a forced GC does not clear it as efficiently. Even though speed hacking is bad, some people do it to our poor servers :(


    My suggestion to all admins is to be like I am and be aggressive with client-side hacks. If someone speed hacks or fly hacks, ban them asap, because they're doing more damage than you expect!

    Please note that in the above image the server was shut down near the end!

    And just to show you how bad version 300 of Bukkit was because it did not unload chunks, this is using a pre-generated world (so no fresh generation) and the subway system:

    This bug was fixed in, I think, 304, and I recommend everyone use 304 since it is nice and stable now.

    So my suggestion: skip 266 to 303 and go right to 304, where the bug has been FIXED!
    I'll try to get a hold of the error next time it happens, although most of the time I have to hard reboot my entire VPS. If it does it again though I'll try 304 and see if that makes it work (hopefully no plugins break cause of this).
