PDA

View Full Version : Database file size



adrian
11-03-2013, 04:36 AM
Hello,
I'm using Mezzmo for a few months now, mainly because it was one of the few apps able to stream external SRT files to playback devices. However I have a huge complaint to file for the developers of this UPnP server: the database file that it creates and uses, Mezzmo.db, grows to unhuman proprotions uselessly.
Let me illustrate what I'm trying to say:
With a library of just 400 video files, the library is taking up 3.5 GIGAbytes. That's around 3500 megabytes of space, for just a few hundred movies. Now imagine having 3000 movies, and maybe some 20.000 songs. I would need to have a dedicated HDD just for the database file Mezzmo is using, it would easily reach 1 TB or more.

Now, being a programmer myself, I decided to take some actions about this ever since I was suspicious when it hit 1 GB in size for just 50 media files. So I did what any developer would do: started digging through the database using a SQLite database manager, so here's what I immediately detected:

- THUMBNAILS are being cloned into the thumbnails table for each playlist that contains a file for that thumbnail, and also for any other file that has the same thumbnail. So, for example, if we have movie which has a thumbnail image that takes 300 kilobytes, and that movie is the thumbnail of 30 playlists (including automatically generated playlists), then it will eat 10 MB of space, instead of just 300 kB. This takes like 95% of the DB space.
- ALL playlists are being stored in the database. This includes automatic playlists. This is stupid, since those are dynamic and should be created on-the-fly, in-memory, at runtime. Also these consume thumbnail storage since they require thumbnails...
- dead entries for a lot of stuff, including playlists. I observed this in the GUI too: when I delete a playlist, I sometimes end-up with ghost entries, or sub-items that no longer have a node parent.

So, what I did after this: wrote myself a nice PHP script to "clean" this database in the following way:

- redundant thumbnails are being completely deleted and only the unique ones are kept (since they have a hash anyway, this is a fast operation). Ofcourse, deleted items referenced by other files and playlists will get the ID of the one and only kept thumbnail's id;
- deleted thumbnails of inexistent or dead entries (as I said, this is a bug in Mezzmo, ghost entries remain many times after deleting stuff)
- removed oprhaned playlists (there were A LOT of those!). Recursively doing this until no entries with inexistent parents remain.
- vacuum - to compact the database file

Final results: after running this cleanup script, my DB file shrank from 3.5 GB to 67 MB. That's 50 times smaller in size, with absolutely the same DB content.

Unofrtunately, I have to run this script after each time Mezzmo starts or I add any new files, because on startup/adding files, it WILL re-duplicate all those thumbnails for the playlist entries again and again, ending up back to gigabytes in size.

So, my request to your DEV team is: please, please, PLEASE, figure this out for your next release. My system drive is not a black-hole storage space for the Mezzmo server. DO NOT DUPLICATE thumbnails over and over again, DO NOT keep in the DB thousands and thousands of dynamic playlists (they're called dynamic for a reason), FIX those dead entries that remain after editing the library, and you'll have happier users.

Thanks.

Paul
11-03-2013, 10:47 AM
Many thanks for the detailed post, Adrian. We've addressed several of these issues in the next release. Email us at support [at] conceiva [dot] com and please send us your script. We'll check that our fixes matches what you are doing.

JohnM
11-04-2013, 03:39 AM
I feel cheated. My db is only 995KB and I have 1958 video files and about 20000 songs. It is like I am not getting the bang for my buck.

Oh wait, it might just be the way I have my system set up. Never mind you guys are doing great keep up the good work.

jeauxbleaux
11-18-2013, 11:38 AM
Wow! I just looked at my database and was chagrined to see that it is just under 2G for about 600 movies and 4000 songs.

I suspect that there is a balance to be struck here. Making the database fully relational will dramatically reduce its size, but may reduce performance in some cases, like multiple users browsing menus on endpoint devices. Replication may shorten menu response time but balloons the database size, especially with all the duplicated bitmaps. It also slows down the Mezzmo application; adding files and editing entries can sometimes be a frustratingly slow process.

All that said, I think there has got to be a way to eliminate most if not all of this replicated data, particularly the thumbnails, without an excessive performance hit. At least now I know why my Mezzmo application runs so much slower than it used to. I know for a fact that MEZZMO.DB was about 65Mb with v2.x.

Please add my voice to the chorus to clean this up. Thanks.

Paul
11-18-2013, 12:11 PM
Thanks for the feedback.

JMC
11-22-2013, 06:07 PM
My DB is 7GB - and database maintenance in the new verions didn't help shrinking that. Would adrian please publish the script or you guys please fix this? The DB growed more than 5GB in the last 2 versions...

Paul
11-22-2013, 09:18 PM
We're looking into this for the next major version of Mezzmo.

Lifferds
11-22-2013, 11:36 PM
Mine is 406 mB but I found the MezzmoTranscode.db was 97 mB with only a single entry, even after compacting. I ran vacuum on it in SQLLite and it shrank to 15 kB. Looks like the compact feature does not compact all db's.

JMC
11-23-2013, 12:48 AM
Holy crap... As mentioned in another thread regarding the permanent task changing playlists... I just opened the database with a SQLite browser - MGOPlaylist has 102341 elements in my database. From the HUGE number of thumbnails inside the database (108533!): Am I right, that for a TV Show for example every episode gets an own thumbnail of the same picture used for the whole show? Now wonder this database is growing all over again...

Edit:
Deleting the dynamic video playlists for file type, actor and director followed by a compact compacted it from more than 7GB to 3,4GB. Compacted it before this to make sure it's the playlists... So you are telling me more more than 3GB inside the database is because of the dynamic playlists????

Paul
11-23-2013, 07:43 AM
The storing of artwork is not optimal at the moment, and this will be fixed for the next version.

jbinkley60
11-23-2013, 08:33 PM
This thread caught my interest because after upgrading from v2.7.1 to 3.4.x my database grew from around 150meg to over 900meg. Also I noticed that whenever I made any changes to the database I would see a status message saying that 5 active playlists were being updated and it would often last for 20-30 minutes (This is on an AMD 8 core prosessor with 32GB of memory). On a whim this morning I deleted all of my active video playlists and then compacted my database. It shrunk down to 250 meg and is much quicker. I left the audio active playlists in the database. I never really used any of the video active playlists because I have everything I want in Smart playlists. All the active playlists were in the database were the ones which came with Mezzmo such as genre, by year etc. Anyway now things are much faster and no more thrashing with the active playlist update messages for 20-30 minutes. My database has around 3,000 videos and 3,300 audio files.

adrian
11-24-2013, 07:08 AM
I feel cheated. My db is only 995KB and I have 1958 video files and about 20000 songs. It is like I am not getting the bang for my buck.

Oh wait, it might just be the way I have my system set up. Never mind you guys are doing great keep up the good work.

Nice one... maybe you meant 995 MB? It's mathematically impossible to store even the meta information for 21.958 files in under 995*1024 bytes since this would mean 46 bytes for each entry, including file path, record ID, thumbnail, etc.
That's assuming you don't have any kind of playlists or any kind of any other information in the database file... I think you get my point.

Anyway, I just updated to 3.4.3.0 and the issue is STILL here. My DB slowly grew again from 180 MB to 3.5 GB the moment the server restarted (it does this in the background until all playlists have a thumbnail, ofcourse duplicated from first file of that playlist).

Since I assume Mezzmo developers are clueless about DB optimizations and how relational tables work in general, here's my PHP script that does the cleanup for me. It requires the PHP binary ofcourse, to run it (outside the scope of this post), with the SQLite extension enabled.
Make sure the server does not currently use the database, but LEAVE IT RUNNING (so it doesn't start doing stupid stuff when restarted).

Save it as a .php file, then invoke the PHP binary with this filename as the argument in a cmd console (I personally created a BAT file that runs this full command).
Note the final step (compacting) will require much time since the journaling will first make a full database copy on disk (so you need at least that amount of additional free space), and then delete itself at the end.



<?php

// REPLACE the path below with your account's user directory
$db = new SQLite3 ('c:\Users\<username>\AppData\Local\Conceiva\Mezzmo\Mezzmo.db');
if (!$db) {
die ("DB init failed");
}
//$db->busyTimeout(60000);

// do everything in a atomic transaction to speed up the final commit
if (!$db->exec("BEGIN TRANSACTION")) {
die ("Transaction failed!");
}

echo "Removing dead playlists...\n";
do {
if (!$db->exec("delete from MGOPlaylist WHERE ParentID>0 AND ParentID NOT IN (SELECT ID FROM MGOPlaylist)")) {
die ("Query failed!");
}
$deleted = $db->changes();
echo " $deleted orphaned playlists deleted\n";
} while ($deleted > 0);

echo "Deleting all playlists thumbnails\n";
$db->exec("update mgoplaylist set thumbnailid=null");

echo "Removing unused thumbnails...";
if (!$db->exec("DELETE from MGOThumbnailData WHERE ID NOT IN (SELECT ThumbnailID FROM MGOFile)")) {
die ("Query failed!");
}
$deleted = $db->changes();
echo $deleted . " deleted\n";

$sql = $db->query("SELECT COUNT(DISTINCT Hash) FROM MGOThumbnailData");
if (!$sql) {
die ("Query failed");
}
$row = $sql->fetchArray();
echo "Distinct thumbnail hashes: " . $row[0] . "\n";
$sql->finalize();

$sql = $db->query("SELECT ID,Hash FROM MGOThumbnailData GROUP BY Hash ORDER BY ID ASC");
while ($row = $sql->fetchArray()) {
echo "Processing entry {$row['ID']}\n";
// TODO escape string - hash is Base64, so no need to escape now
$otherEntriesRes = $db->query("SELECT ID FROM MGOThumbnailData WHERE Hash='{$row['Hash']}' AND ID<>{$row['ID']}");
while ($rowOtherEntry = $otherEntriesRes->fetchArray()) {
echo " Removing entry {$rowOtherEntry['ID']}... ";
if (!$db->exec("UPDATE MGOFile SET ThumbnailID={$row['ID']} WHERE ThumbnailID={$rowOtherEntry['ID']}")) {
die ("Query failed!");
}
echo $db->changes() . " files, ";
if (!$db->exec("UPDATE MGOPlaylist SET ThumbnailID={$row['ID']} WHERE ThumbnailID={$rowOtherEntry['ID']}")) {
die ("Query failed!");
}
echo $db->changes() . " playlists affected\n";
}
$otherEntriesRes->finalize();

$db->exec("DELETE FROM MGOThumbnailData WHERE Hash='{$row['Hash']}' AND ID<>{$row['ID']}");
}
$sql->finalize();

$newId = 1;
$sql = $db->query("SELECT ID,Hash FROM MGOThumbnailData GROUP BY Hash ORDER BY ID ASC");
while ($row = $sql->fetchArray()) {
if ($row['ID'] > $newId) {
echo "Fixing entry {$row['ID']}... ";
if (!$db->exec("UPDATE MGOThumbnailData SET ID=$newId WHERE ID={$row['ID']}")) {
die ("Query failed!");
}
if (!$db->exec("UPDATE MGOFile SET ThumbnailID=$newId WHERE ThumbnailID={$row['ID']}")) {
die ("Query failed!");
}
echo $db->changes() . " files, ";
if (!$db->exec("UPDATE MGOPlaylist SET ThumbnailID=$newId WHERE ThumbnailID={$row['ID']}")) {
die ("Query failed!");
}
echo $db->changes() . " playlists affected\n";
}
$newId++;
}
$sql->finalize();

// The section below removes any internal subtitles for all entries, since I only use external SRT file. Uncomment only if you have a similar setup.
//echo "Fixing subtitles...\n";
//$db->exec("DELETE FROM MGOSubtitles WHERE SubsTrack>0");
//$db->exec("UPDATE MGOSubtitles SET SubsTitle=NULL, SubsLangID=NULL, SubsCharset=NULL");


if (!$db->exec("COMMIT")) {
die ("Transaction failed!");
}

echo "Compacting DB... this might take a few minutes or more\n";
$sql = $db->exec("PRAGMA page_size=4096");
$sql = $db->exec("VACUUM");

$db->close();

?>

Paul
11-25-2013, 07:56 AM
Thanks for sharing this, Adrian. As mentioned in my other posts, the database optimizations will be included in the next version.

JiKo
12-17-2013, 09:31 PM
Thanks for sharing this, Adrian. As mentioned in my other posts, the database optimizations will be included in the next version.

Please please my database is now about 7-9GB!!!! i have now 12GB free on my system partion. Why i can't move the database to a other drive?

KR JiKo

JMC
12-17-2013, 09:34 PM
With the newest version my DB was compacted by almost 2GB using the builtin mezzmo database compact, did you try this?

JiKo
12-17-2013, 11:48 PM
With the newest version my DB was compacted by almost 2GB using the builtin mezzmo database compact, did you try this?

Yes, but i can't.. when i do this, mezzmo need more as the double space for compacting. but my drive have no diskspace for this (after 1-5 minutes after starting compacting drive c: is out of space and the pc is slowly and will near stay.. :( and so say the mezzmo popup "compact size the same before and after the compacting" :(

JMC
12-17-2013, 11:55 PM
And there is a way to move the mezzmo database to a different drive (at least for compacting this would be a way and with NTFS drives): Using NTFS Junctions/Hardlinks and linking the Folder to a different folder on a different drive and keeping the database there.

Paul
12-18-2013, 10:15 AM
Why i can't move the database to a other drive?

You can move your Mezzmo database to another drive/folder. See this FAQ - http://forum.conceiva.com/showthread.php/6027-Tutorial-Moving-your-Mezzmo-database-to-a-different-drive-folder

Paul
12-18-2013, 10:17 AM
And there is a way to move the mezzmo database to a different drive (at least for compacting this would be a way and with NTFS drives): Using NTFS Junctions/Hardlinks and linking the Folder to a different folder on a different drive and keeping the database there.

Rather than using junctions or symlinks, I'd recommend following our FAQ to move to your Mezzmo database - see http://forum.conceiva.com/showthread.php/6027-Tutorial-Moving-your-Mezzmo-database-to-a-different-drive-folder.

JMC
12-18-2013, 05:16 PM
You're right Paul - I didn't know about the way with the ini (since I didn't move my DB) and thought if there is no other way there is this way :)

rbcrewser
01-08-2014, 09:02 AM
Shamelessly bumping this thread as the issue still seems unresolved.

FWIW my db file is close to 4GB with just 74 movie files, a tad over 16,500 music files and no play lists.

I realize adrian has kindly provided his php binary in a post above, but for those of us who don't want to go through installing and configuring php, sqlite, etc for the sole purpose of reducing the mezzmo database file (self-defeating if you have to install something to reduce disk space usage), there doesn't seem to be much of an option right now (unless of course some kind soul could convert that php to a vbs script for other to run?).

So is there a real solution on the horizon for this?

Paul
01-08-2014, 09:12 AM
We are fixing this for the upcoming Mezzmo 4.0.

adrian
01-13-2014, 08:06 AM
I created a ZIP archive with all files needed to run the database cleanup script. Since it's 3 MB I had to upload it (couldn't attach it here).

Download link: http://we.tl/AHkUJqJJFa (auto-expires in 7 days)

Instructions: unzip archive right next to the Mezzmo.db file. You should end up with the following file structure:
- [...]
- Mezzmo.db
- db_compact
___ - fixMezzmoDb.php
___ - run_cleanup.bat
___ - php*

Double click the run_cleanup.bat file - it will launch the php.exe interpreter which will load and run the fixMezzmoDb.php script.
The archive is self-contained, no additional runtime or libraries are required.
You will need to have an additional free space at least the same size as the Mezzmo.db file or else the compact will fail.. If your database is 7 GB, you need to have at least 7.01 GB of free space. After the compact finishes, you will get back both the 7 GB required by the temporary journaling, plus the trimmed 6+ GB from the original database file

Note that this solution is not permanent. When Mezzmo updates a dynamic playlist (after adding some new files for example), all affected playlists will again duplicate all thumbnails. Slowly it will grow, and grow, and grow... I would run this script daily for example.

adrian
01-13-2014, 08:30 AM
Sorry for two posts in a row, but I suggest the following to the Mezzmo developers, since they'll probably scratch their heads after the issue will be noticed in the future:

After you guys fix this redundancy/space problem, you will need to keep the existing user data. You probably understand by now that re-compacting a database that has gigabytes in size, will be impossible in the most scenarios, due to space and performance limitations. So here's my most valuable advice for your 4.0 issue-solver code: do not run a VACUUM on a database with gigabytes in size, since it could take like forever. Be smart and create a NEW database file, and copy the existing un-redundant tables and data from the OLD Mezzmo.db file. This will not require the same-as_db-size free space issue.

I could have done this myself in the script I posted, but since I have no idea what tables or columns you might decide to add in a future scheme, it's basically not something I would risk myself to do. Plus, the database might be updated by the Mezzmo process while the script is still copying data, so yeah... it's your responsability to solve this.

Paul
01-13-2014, 11:57 AM
Many thanks for all your helpful information, adrian. We'll run various tests on our new v4.0 database (including upgrading) and make sure it all operates OK.

scottier
01-16-2014, 04:57 PM
...Slowly it will grow, and grow, and grow... I would run this script daily for example.

Adrian, thank you for the script. It would not run on my computer until I downloaded the latest PHP files and placed them in the directory. How long did it take to run on your computer. After running for 12 hours I had to reboot so I don't think it ever completed. DB is still 5.1GB. I might run it like once every two weeks just due to how long it took to execute.

adrian
01-26-2014, 01:50 AM
Adrian, thank you for the script. It would not run on my computer until I downloaded the latest PHP files and placed them in the directory. How long did it take to run on your computer. After running for 12 hours I had to reboot so I don't think it ever completed. DB is still 5.1GB. I might run it like once every two weeks just due to how long it took to execute.

The operation time depends on database size, processing power, memory, and hard drive performance. I takes 10 minutes for my 4 GB database, but I have a powerful PC.

I've updated the PHP script to run faster, by copying the entire final content into a new database file instead of doing a reconstruct of the original, as I explained earlier. You can try running this and see how it works. It should be way faster, since it will only do a secvential write operation.

Note that at the end you have to delete the Mezzmo.db file and rename the Mezzmo_temp.db to Mezzmo.db. I can't do this automatically because you will need to manually shutdown Mezzmo server before you will be able to delete the database file.

Concepta
06-16-2014, 03:40 AM
Since I am not allowed to start a new thread I'll have to tag on to the end of this one! My gripe is that whilst some media servers allow the user to turn off thumbnaiing (altogether) Mezzmo does not. I insist on having this feature and if it's not included in the very near future I shall have to ask for a refund.
I have another gripe, and although it's not related to this thread I'm going to append it here (again because I am not allowed to start a new one + I dont' want to be bothered rummaging thro' old posts). It's a combination of two things: (1) that Mezzmo defaults to putting it's database + all its temp. files on C: (without 1st asking the user), and (2) it downloads updates to C: (again without 1st asking the user). I like, as far as possible to reserve C: for the OS and object to this behaviour.

Paul
06-16-2014, 10:18 AM
Not sure why you cannot create a new thread :confused:

In Mezzmo v4.0 we've just added backdrop artwork and we do allow you to turn these off (via the Metadata Retrieval Settings dialog), so for consistency we should also allow users to turn off poster artwork as well. We'll look into it for a future version.

In Mezzmo v4.0, you can now change/move the Mezzmo database location to any drive/folder in the Options dialog (General page). Click the folder icon in the Mezzmo database location edit box and click OK.

Regarding updates, you can change the folder using the following steps:


Using Windows Explorer, go to the Mezzmo program folder and run AutoUpdate.exe. Close this program after it runs.
Used Windows Registry Editor (regedit.exe) to edit your registry (be careful - make a backup of your registry before making any changes!) Go to the following location in your registry:
"HKEY_LOCAL_MACHINE\Software\Conceiva\Mezzmo\Autoup date\Folder" and you can edit the "Temp" entry to change the location of where updates are downloaded to.

ftanner
06-17-2014, 04:52 AM
Not sure why you cannot create a new thread :confused:

In Mezzmo v4.0 we've just added backdrop artwork and we do allow you to turn these off (via the Metadata Retrieval Settings dialog), so for consistency we should also allow users to turn off poster artwork as well. We'll look into it for a future version.

In Mezzmo v4.0, you can now change/move the Mezzmo database location to any drive/folder in the Options dialog (General page). Click the folder icon in the Mezzmo database location edit box and click OK.

Regarding updates, you can change the folder using the following steps:


Using Windows Explorer, go to the Mezzmo program folder and run AutoUpdate.exe. Close this program after it runs.
Used Windows Registry Editor (regedit.exe) to edit your registry (be careful - make a backup of your registry before making any changes!) Go to the following location in your registry:
"HKEY_LOCAL_MACHINE\Software\Conceiva\Mezzmo\Autoup date\Folder" and you can edit the "Temp" entry to change the location of where updates are downloaded to.


Be careful with moving the database. I did this and it wiped out all of my Mezzmo settings and database contents. I suspect that the contents are still there since the folder where I moved them to is almost 3GB in size, but Mezzmo no longer sees the data. It's been just shy of two weeks now, and despite being told on the 10th that they had a fix for it, I am now being told that I have to re-submmit my log files. I made the mistake of assuming that they had the database movement working. They do not. Sure, I could completely re-create my database, but I *SHOULD NOT* have to do this after using a feature that they say is working. Especially considering that I have about 700 movies and dozens of TV shows, in addition to about 500GB of music.

Concepta
06-23-2014, 02:50 AM
Go to the following location in your registry:
"HKEY_LOCAL_MACHINE\Software\Conceiva\Mezzmo\Autoup date\Folder"
and you can edit the "Temp" entry to change the location of where updates are downloaded to.


I found 3 "AutoUpdate" keys in the following locations:
HKLM\SOFTWARE\Wow6432Node\Conceiva\Mezzmo\AutoUpda te\Folder
Name: Temp

HKLM\SOFTWARE\Wow6432Node\Conceiva\Mezzmo\AutoUpda te\Folder
Name: Logs

HKLM\SOFTWARE\Wow6432Node\Conceiva\Mezzmo\AutoUpda te\Folder
Name: Exec

Paul
06-23-2014, 07:59 AM
That's right. Edit the "Temp" entry to change the folder location where your Mezzmo updates are downloaded to.