MongoDB – file size is huge and growing

mongodb

I have an application that use mongo for storing short living data. All data older than 45 minutes is removed by script something like:

oldSearches = [list of old searches]
connection = Connection()
db = connection.searchDB
res = db.results.remove{'search_id':{"$in":oldSearches}})

I've checked current status –

>db.results.stats()
{
        "ns" : "searchDB.results",
        "count" : 2865,
        "size" : 1003859656,
        "storageSize" : 29315124464,
        "nindexes" : 1,
        "ok" : 1
}

So, according to this 1gb of data occupies 29GB of storage. Data folder looks like this(You may see that many files are very old – last accessed in the middle of may):

ls -l /var/lib/mongodb/
total 31506556
-rwxr-xr-x 1 mongodb nogroup          6 2011-06-05 18:28 mongod.lock
-rw------- 1 mongodb nogroup   67108864 2011-05-13 17:45 searchDB.0
-rw------- 1 mongodb nogroup  134217728 2011-05-13 14:45 searchDB.1
-rw------- 1 mongodb nogroup 2146435072 2011-05-20 20:45 searchDB.10
-rw------- 1 mongodb nogroup 2146435072 2011-05-28 00:00 searchDB.11
-rw------- 1 mongodb nogroup 2146435072 2011-05-27 13:45 searchDB.12
-rw------- 1 mongodb nogroup 2146435072 2011-05-29 16:45 searchDB.13
-rw------- 1 mongodb nogroup 2146435072 2011-06-07 13:50 searchDB.14
-rw------- 1 mongodb nogroup 2146435072 2011-06-06 01:45 searchDB.15
-rw------- 1 mongodb nogroup 2146435072 2011-06-07 13:50 searchDB.16
-rw------- 1 mongodb nogroup 2146435072 2011-06-07 13:50 searchDB.17
-rw------- 1 mongodb nogroup 2146435072 2011-06-06 09:07 searchDB.18
-rw------- 1 mongodb nogroup  268435456 2011-05-13 14:45 searchDB.2
-rw------- 1 mongodb nogroup  536870912 2011-05-11 00:45 searchDB.3
-rw------- 1 mongodb nogroup 1073741824 2011-05-29 23:37 searchDB.4
-rw------- 1 mongodb nogroup 2146435072 2011-05-13 17:45 searchDB.5
-rw------- 1 mongodb nogroup 2146435072 2011-05-18 17:45 searchDB.6
-rw------- 1 mongodb nogroup 2146435072 2011-05-16 01:45 searchDB.7
-rw------- 1 mongodb nogroup 2146435072 2011-05-17 13:45 searchDB.8
-rw------- 1 mongodb nogroup 2146435072 2011-05-23 16:45 searchDB.9
-rw------- 1 mongodb nogroup   16777216 2011-06-07 13:50 searchDB.ns
-rw------- 1 mongodb nogroup   67108864 2011-04-23 18:51 test.0
-rw------- 1 mongodb nogroup   16777216 2011-04-23 18:51 test.ns

According to "top" mongod uses 29G of virtual memory ( and 780Mb of RSS)

Why do i have such abnormal values? Do i need to run something additional to .remove() function to clean up database from old values?

Best Solution

Virtual memory size and resident size will appear to be very large for the mongod process. This is benign: virtual memory space will be just larger than the size of the datafiles open and mapped; resident size will vary depending on the amount of memory not used by other processes on the machine.

http://www.mongodb.org/display/DOCS/Caching

When you remove an object from MongoDB collection, the space it occupied is not automatically garbage collected and new records are only appended to the end of data files, making them grow bigger and bigger. This explains it all:

http://www.mongodb.org/display/DOCS/Excessive+Disk+Space

For starters, simply use:

db.repairDatabase()