{"id":632,"date":"2006-04-01T16:21:32","date_gmt":"2006-04-01T06:21:32","guid":{"rendered":"http:\/\/www.flamingspork.com\/blog\/2006\/04\/01\/getting-rid-of-duplicate-emails-elegantly\/"},"modified":"2006-04-01T16:21:32","modified_gmt":"2006-04-01T06:21:32","slug":"getting-rid-of-duplicate-emails-elegantly","status":"publish","type":"post","link":"https:\/\/www.flamingspork.com\/blog\/2006\/04\/01\/getting-rid-of-duplicate-emails-elegantly\/","title":{"rendered":"getting rid of duplicate emails, elegantly"},"content":{"rendered":"<p>I like duplicate emails in the way that everybody is thinking. This is different.<\/p>\n<p>Due to a bug in offlineimap I hit a little while ago, it&#8217;s managed to make copies (sometimes even two copies) of each email in certain folders. Now, this isn&#8217;t so bad as<\/p>\n<p>a) email didn&#8217;t get lost<\/p>\n<p>b) it&#8217;s just using extra disk, and disk is cheap.<\/p>\n<p>but it is annoying when searching.<\/p>\n<p>It&#8217;s also annoying because it&#8217;s decided to do this on folders such as INBOX\/MySQL\/bugs which contains an email for each change to a bug report ever since I joined the company. That adds up to a lot of wasted inodes and disk blocks.<\/p>\n<p>So, I&#8217;ve revived this project that I have in the back of my head of efficiently storing email in a database and being able to sync between instances of it.<\/p>\n<p>This gives us some nice advantages. You can use replication to keep a backup of your email. You can put it in Cluster and have high availability email.<\/p>\n<p>We can also do some neat tricks with tables of all that info that you need to display lists of emails and probably get performance boosts instead of having to open each mail as we currently do. i.e. current email solutions don&#8217;t scale to a million emails in a folder.<\/p>\n<p>Partitioning will also be useful to make searches quicker (odds are what we&#8217;re searching for is recent and all sorts of foo).<\/p>\n<p>Anyway&#8230;. 
it&#8217;s interesting to see the bunch of errors that gets thrown up by the Mail::Box Perl module on some of my Maildirs. Hrrm&#8230; I may have to resort to my own more error-tolerant code. I&#8217;m determined to write scripts that cannot possibly lose anything.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I like duplicate emails in the way that everybody is thinking. This is different. Due to a bug in offlineimap I hit a little while ago, it&#8217;s managed to make copies (sometimes even two copies) of each email in certain &hellip; <a href=\"https:\/\/www.flamingspork.com\/blog\/2006\/04\/01\/getting-rid-of-duplicate-emails-elegantly\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2},"jetpack_post_was_ever_published":false},"categories":[14],"tags":[],"class_list":["post-632","post","type-post","status-publish","format-standard","hentry","category-mysql"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p5a6n8-ac","jetpack-related-posts":[{"id":809,"url":"https:\/\/www.flamingspork.com\/blog\/2007\/03\/28\/patching-your-mission-critical-email-syncing-software-on-your-life-setup-my-offlineimap-patch-for-today\/","url_meta":{"origin":632,"position":0},"title":"Patching your mission-critical email syncing software on your life 
setup&#8230; my OfflineIMAP patch for today","author":"Stewart Smith","date":"2007-03-28","format":false,"excerpt":"I've used OfflineIMAP for quite a while now. On the whole I'm fairly happy with it. Today I sent this to the list: Forgive the potentially bad python, not my native tongue :) This patch is motivated by three things: - offlineimap is extremely slow at syncing lots of locally\u2026","rel":"","context":"In &quot;life, the universe and everything&quot;","block_context":{"text":"life, the universe and everything","link":"https:\/\/www.flamingspork.com\/blog\/category\/life-the-universe-and-everything\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":586,"url":"https:\/\/www.flamingspork.com\/blog\/2006\/02\/17\/ndb-disk-requirements\/","url_meta":{"origin":632,"position":1},"title":"NDB Disk Requirements","author":"Stewart Smith","date":"2006-02-17","format":false,"excerpt":"up to 3 copies of data (3*DataMemory) + 64MB * NoOfFragLogFiles (default=8) + UNDO log (dependent on update speed) For example: DataMemory=1024MB idea on disk usage= 1024*3 + 64 * 8 = 3584MB + UNDO log It's very tempting to have a \"SHOW ESTIMATES\" command in the management client\/server that\u2026","rel":"","context":"In &quot;mysql&quot;","block_context":{"text":"mysql","link":"https:\/\/www.flamingspork.com\/blog\/category\/work-et-al\/mysql\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":751,"url":"https:\/\/www.flamingspork.com\/blog\/2006\/10\/23\/weekly-builds\/","url_meta":{"origin":632,"position":2},"title":"weekly builds","author":"Stewart Smith","date":"2006-10-23","format":false,"excerpt":"Saturn's autoweb I've hacked my scripts that generate doxygen docs to also build MySQL 4.1, 5.0 and 5.1 for AMD64 (the box that it's running on) with Cluster. This is to help my idea of running Gallery at home with NDB disk data tables in very recent MySQL builds. 
How's\u2026","rel":"","context":"In &quot;mysql&quot;","block_context":{"text":"mysql","link":"https:\/\/www.flamingspork.com\/blog\/category\/work-et-al\/mysql\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":897,"url":"https:\/\/www.flamingspork.com\/blog\/2007\/09\/27\/things-that-break-while-travelling\/","url_meta":{"origin":632,"position":3},"title":"Things that break while travelling&#8230;.","author":"Stewart Smith","date":"2007-09-27","format":false,"excerpt":"This year, it seems that whenever I go out for significant travel, the following things will break on my trip: a laptop power supply a disk At least this time the disk is part of a RAID1 array. Oh, and for some reason my mythbackend stopped doing anything a few\u2026","rel":"","context":"In &quot;life, the universe and everything&quot;","block_context":{"text":"life, the universe and everything","link":"https:\/\/www.flamingspork.com\/blog\/category\/life-the-universe-and-everything\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":334,"url":"https:\/\/www.flamingspork.com\/blog\/2005\/01\/06\/effective-bk-usage\/","url_meta":{"origin":632,"position":4},"title":"effective bk usage","author":"Stewart Smith","date":"2005-01-06","format":false,"excerpt":"(inspired by jimw talking about it on Planet MySQL) I take a bit of a different approach... I've got directories for 4.0, 4.1 and 5.0, and within them, I have clones of the main ndb tree (called ndb, so there's a path like \"MySQL\/5.0\/ndb\"). 
I don't ever edit in this\u2026","rel":"","context":"In &quot;mysql&quot;","block_context":{"text":"mysql","link":"https:\/\/www.flamingspork.com\/blog\/category\/work-et-al\/mysql\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":759,"url":"https:\/\/www.flamingspork.com\/blog\/2006\/11\/13\/disk-allocation-xfs-ndb-disk-data-and-more\/","url_meta":{"origin":632,"position":5},"title":"Disk allocation, XFS, NDB Disk Data and more&#8230;","author":"Stewart Smith","date":"2006-11-13","format":false,"excerpt":"I've talked about disk space allocation previously, mainly revolving around XFS (namely because it's what I use, a sensible choice for large file systems and large files and has a nice suite of tools for digging into what's going on).Most people write software that just calls write(2) (or libc things\u2026","rel":"","context":"In &quot;linux-kernel&quot;","block_context":{"text":"linux-kernel","link":"https:\/\/www.flamingspork.com\/blog\/category\/linux-kernel\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]}],"jetpack_likes_enabled":true,"_links":{"self":[{"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/posts\/632","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/comments?post=632"}],"version-history":[{"count":0,"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/posts\/632\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/media?parent=632"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/categories?post=632"},{"taxonomy":"post_t
ag","embeddable":true,"href":"https:\/\/www.flamingspork.com\/blog\/wp-json\/wp\/v2\/tags?post=632"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}