bugzon - an agile ajax-bugzilla extension

I've started experimenting with Bugzilla's JSON-RPC interface to create a quicker JavaScript UI, and have even gone as far as using localStorage for offline use and offline views of bugs. Then recently I was pointed to trello.com, which does all these things magically in their all-new service. I am still trying to finish what I started in Bugzilla; people still use it, you know. Here is a first screen dump:

Tar-Tiff plugin for Thunderbird

I use Thunderbird and have friends that use Mail.app. Mail.app inlines TIFF images when images are copy-pasted into mails. For some reason relating to web standards, Thunderbird seems unwilling to support perfectly free things such as libtiff for solving this issue. See the bug at https://bugzilla.mozilla.org/show_bug.cgi?id=160261 for details. The worst part is that Thunderbird fails to make the file accessible in any way, even for downloading and viewing in an external program.

So I started looking at solving this with a plugin of my own and I found some good pieces out there: 
A: http://code.google.com/p/tiffus/ - A JavaScript library (written in GWT) for parsing TIFF, among other file formats. This would make loading OS-independent and easy to integrate for sure.
B: http://blog.xulforum.org/index.php?post/2011/03/14/Basic-MimeMessage-demo - A small MimeMessage inspector that could be modified to look for inline images and attachments. 
C: Time to test things out. Or perhaps a sponsor for the full Thunderbird extension?


Google ready to kill software patents?

Google on Oracle vs Google: "Each of the Patents-in-Suit is invalid under 35 U.S.C. § 101 because one or more claims are directed to abstract ideas or other non-statutory subject matter."
Kudos, Google! Refusing software patents like this is the right thing to do for innovation!
More at Groklaw: http://www.groklaw.net/article.php?story=20101111114933605


Java finally adds NIO2.

Java 7 comes with NIO2, "New I/O version 2", a stupid name I know, but it's packing some extremely important functions.
These new functions will enable us to do faster indexing, track changes in file systems and read more file attributes, such as users and groups.
I have been waiting for this since early 2002, when the proposal for NIO2 came. I almost gave up hope on Java since then.
This makes it possible to update some core I/O functions in corpus and in our public Java libraries for indexing.
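
As a rough illustration of the two features mentioned above, here is a minimal sketch using the Java 7 `java.nio.file` API. The class name `Nio2Sketch` and its helper methods are my own, made up for this example; owner and group are only read where the file system reports POSIX attribute support.

```java
import java.io.IOException;
import java.nio.file.*;
import java.nio.file.attribute.BasicFileAttributes;
import java.nio.file.attribute.PosixFileAttributes;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical helper class, not part of any library.
class Nio2Sketch {

    /** Describe a file via NIO2 attributes; owner and group only where POSIX is supported. */
    static String describe(Path file) throws IOException {
        BasicFileAttributes basic = Files.readAttributes(file, BasicFileAttributes.class);
        StringBuilder sb = new StringBuilder();
        sb.append(file.getFileName())
          .append(" size=").append(basic.size())
          .append(" modified=").append(basic.lastModifiedTime());
        if (file.getFileSystem().supportedFileAttributeViews().contains("posix")) {
            PosixFileAttributes posix = Files.readAttributes(file, PosixFileAttributes.class);
            sb.append(" owner=").append(posix.owner().getName())
              .append(" group=").append(posix.group().getName());
        }
        return sb.toString();
    }

    /** Register a directory for create, modify and delete events. */
    static WatchService watch(Path dir) throws IOException {
        WatchService watcher = dir.getFileSystem().newWatchService();
        dir.register(watcher,
                StandardWatchEventKinds.ENTRY_CREATE,
                StandardWatchEventKinds.ENTRY_MODIFY,
                StandardWatchEventKinds.ENTRY_DELETE);
        return watcher;
    }

    /** Wait up to timeoutMillis for events and return them as "KIND name" strings. */
    static List<String> pollChanges(WatchService watcher, long timeoutMillis)
            throws InterruptedException {
        List<String> changes = new ArrayList<String>();
        WatchKey key = watcher.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        if (key != null) {
            for (WatchEvent<?> event : key.pollEvents()) {
                changes.add(event.kind().name() + " " + event.context());
            }
            key.reset(); // re-arm the key so later events are delivered
        }
        return changes;
    }
}
```

Note that a watch covers a single directory, not its subfolders, so an indexer would have to register every folder it cares about.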


Gentle local file indexing, please

Unlike websites, local file systems tend to give much better feedback on file changes. Still, most search solutions use considerable I/O, which is very annoying. Users get annoyed to the extent that they uninstall the search and indexing altogether. I've done that with Google, Microsoft and other desktop search tools too.

Still, I know that there is a difficult balance to all this. Today there is good OS support for file events. I recently read this post about using .NET APIs to monitor changes in file systems. There are also Linux counterparts such as inotify or the audit daemon auditd that do the same by listening to kernel events. The manual, OS-independent method is to watch the modified time stamp on all folders for changes. The worst case is having to scan the entire folder tree for changes, as if virus scans were not annoying enough on their own.
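
The worst-case scan could look roughly like the sketch below. It walks the whole tree and compares each file's modified time stamp against the time of the last index run. The class and method names are hypothetical, and I check time stamps on files rather than folders here, since a folder's time stamp does not change when the content of an existing file is edited.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;

// Hypothetical fallback scanner, for illustration only.
class ModifiedSinceScan {

    /** Collect regular files under root that were modified after sinceMillis. */
    static List<Path> changedSince(Path root, final long sinceMillis) throws IOException {
        final List<Path> changed = new ArrayList<Path>();
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                // attrs is handed over by the walk itself, so no extra attribute read per file
                if (attrs.isRegularFile()
                        && attrs.lastModifiedTime().toMillis() > sinceMillis) {
                    changed.add(file);
                }
                return FileVisitResult.CONTINUE;
            }
        });
        return changed;
    }
}
```

This still touches every directory entry, which is exactly the I/O cost complained about above, so it only makes sense as a catch-up step after the event listener has been off.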

The event monitoring solutions work as long as they are on, but changes go unnoticed while the listening agent is off, and they need to fall back to a scan-for-changes mode when they are switched on again, costing considerable and annoying I/O activity. Then comes the indexing, which really picks up the I/O. Some ways to keep it gentle:

* Late indexing and grouped indexing before searching; just log changes until then.
* While idle, create a low-priority process for indexing groups of changed/new documents in one phase.
* Push and convert documents by type to be nice to I/O, e.g. by converting all doc files to XML at once.
* Push indexing to a remote server to reduce the load.
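
The "just log changes" and "by type" ideas from the list could be sketched like this. It is only a toy in-memory log, grouped by file extension; the class name `PendingChanges` and its methods are made up for the example, and a real tool would persist the log and hand the batches to an idle-time indexer.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical change log: records paths now, indexes them later in per-type batches.
class PendingChanges {
    // extension -> changed paths; a set, so repeated events for one file collapse to one entry
    private final Map<String, Set<String>> byType = new TreeMap<String, Set<String>>();

    /** Note that a path changed. Nothing is read or indexed at this point. */
    void record(String path) {
        int dot = path.lastIndexOf('.');
        int slash = Math.max(path.lastIndexOf('/'), path.lastIndexOf('\\'));
        // only treat the dot as an extension separator if it is in the file name part
        String ext = dot > slash ? path.substring(dot + 1).toLowerCase(Locale.ROOT) : "";
        Set<String> paths = byType.get(ext);
        if (paths == null) {
            paths = new LinkedHashSet<String>();
            byType.put(ext, paths);
        }
        paths.add(path);
    }

    /** Hand over everything logged so far, grouped by type, and start a fresh log. */
    Map<String, List<String>> drain() {
        Map<String, List<String>> batches = new TreeMap<String, List<String>>();
        for (Map.Entry<String, Set<String>> e : byType.entrySet()) {
            batches.put(e.getKey(), new ArrayList<String>(e.getValue()));
        }
        byType.clear();
        return batches;
    }
}
```

Calling `drain()` right before a search, or when the machine goes idle, gives one batch per document type, so that all doc files can be converted in one go.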

Anything else to consider for desktop/enterprise search?


Experimenting with Lucene 3 and parametric search

I have experimented with Apache Lucene 3 and parametric search:

It's just a test shot, but it:
* Gives you all search results on a single page - as a speed test
* Summarizes inventors, applicants, patent-classes etc from the result as an overview.
* Is a playground for further JavaScript interaction with the results.
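
The summary part is basically counting field values over the result set. Here is a minimal sketch of that idea in plain Java. It deliberately does not use the Lucene API: each hit is modelled as a simple map from field name to values, and the class name and field names are only assumptions for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical summarizer; a hit is modelled as field name -> list of values.
class ResultSummary {

    /** Count how often each value of the given field occurs, most frequent first. */
    static Map<String, Integer> summarize(List<Map<String, List<String>>> hits, String field) {
        Map<String, Integer> counts = new TreeMap<String, Integer>();
        for (Map<String, List<String>> hit : hits) {
            List<String> values = hit.get(field);
            if (values == null) {
                continue; // this hit has no such field
            }
            for (String value : values) {
                counts.merge(value, 1, Integer::sum);
            }
        }
        List<Map.Entry<String, Integer>> entries =
                new ArrayList<Map.Entry<String, Integer>>(counts.entrySet());
        // stable sort: values with equal counts stay in alphabetical order
        entries.sort((a, b) -> Integer.compare(b.getValue(), a.getValue()));
        Map<String, Integer> ordered = new LinkedHashMap<String, Integer>();
        for (Map.Entry<String, Integer> e : entries) {
            ordered.put(e.getKey(), e.getValue());
        }
        return ordered;
    }
}
```

Running it once per field (inventor, applicant, patent class) over the hits gives the overview lists shown next to the results.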

Indexing was kind of slow on my quad core: 120k patents took about 40 minutes, but searching is fast. It made me think about setting up some test suites for indexing. Like:

* Solr / Lucene for all Swedish municipal sites (250+)
* Solr / Lucene for a typical Windows server environment.

Anyone done this already perhaps?



New hope for system wide secure memory with G1-GC

I had this idea years ago about segmenting GC memory into small process islands and using proxies between them, in order to speed up and use global, yet secure, memory addressing system wide. This idea is not brand new, but it combines the best from distributed memory models with tightly integrated memory and security, or so was my idea back in 2003 at least. It didn't sell well though. Yep, I tried!

But here comes new hope: the new garbage collector "G1" for Java 1.7.

With better languages on top of the JVM, like Scala and Groovy, it seems Java might have a chance to bring us this cool platform, even though it has considerable legacy problems.



BBC reports on study saying patents at universities are blocking innovation

BBC reports that the Canada-based Innovation Partnership, a non-profit consultancy, states in a newly released report that "'Blocking patents' are delaying advances in cancer medicine and food crops" and that "the full benefits of synthetic biology and nanotechnology will not be realised without urgent reforms to encourage sharing of information". See http://news.bbc.co.uk/2/hi/science/nature/7632318.stm for the article. The full report is available (pdf, CC-license) at http://www.theinnovationpartnership.org/en/ieg/report/