fsync() Across Platforms

When an application writes a file, the data does not become permanent immediately. The write operation first moves the data into the operating system cache in RAM, where it is vulnerable to system crashes and loss of power. The second step is the transfer to the hard disk, which normally has write caching enabled. The disk acknowledges the data straight away, but keeps it in its write cache, which is still volatile memory. The data is now safe from a system crash (ignoring some worst-case scenarios), but not from loss of power. On a modern disk, this may be 16 MB or more of data in an unknown state.
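A minimal sketch of the application side of this path, assuming a POSIX system. The helper name write_durably is made up for illustration. Note that fsync() only asks the operating system to push its cached data to the device; whether the disk's own write cache is flushed as well varies by platform.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Write len bytes to path, then ask the OS to flush them to the device.
 * Returns 0 on success, -1 on failure with errno set by the failing call. */
int write_durably(const char *path, const void *data, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    const char *p = data;
    while (len > 0) {
        ssize_t n = write(fd, p, len);   /* data lands in the OS cache */
        if (n < 0) {
            int saved = errno;
            close(fd);
            errno = saved;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }

    if (fsync(fd) != 0) {                /* push cached data to the device */
        int saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }
    return close(fd);
}
```

A caller would treat a return value of -1 as "the data may not be on disk" rather than retrying blindly.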

As performance enhancements in ext4 have made committing data to disk a contentious issue, I’ve written a note on how different platforms handle data consistency.

The Mystery of ProxyPassReverse

The mod_proxy_ajp module for Apache has many advantages over mod_jk for connecting a Tomcat server to an Apache front end. For me, the crucial advantage was the ProxyPassReverseCookiePath directive, which allows me to map the session cookies of a Tomcat web application (other than the root application) into the root of a virtual host.

Unfortunately, many tutorials contain misleading advice and recommend this pattern for ProxyPassReverse, which breaks if the web application issues a redirect:

ProxyPass /jspdir ajp://localhost:8009/jspdir
ProxyPassReverse /jspdir ajp://localhost:8009/jspdir

Wrapping a Native Library with Maven

I recently converted a large project to build with Maven. The project contained both C++ and Java code, and produced a web application, a standalone server application, plus a number of small command line tools. The project used a large number of open-source Java libraries, and Maven tamed these easily. The native C++ library proved harder, and this is the approach I took.

The code snippets below are part of a complete example that builds a tiny Java/C++ application under Linux. This should port easily to other Unix-like platforms, and may be of some help when performing the same task under Windows. The example is available in tar.gz and zip formats.

A Working TFTP Server for Multi-Homed Linux Systems

Linux machines with multiple network interfaces are unreliable as TFTP servers. This issue has been outstanding for a long time, without any resolution. The patch attached to the Debian bug fixes the problem for an old release of tftpd-hpa, but does not apply cleanly to recent releases.

Recent releases of dnsmasq contain a TFTP server which does not have this problem. While this doesn’t solve every case, it provides a tidy solution for a machine which provides BOOTP and TFTP services to several subnets.

UTC, SQL Server, and Spring

I’ve recently been introducing the Spring Framework into an existing Java application, using it to speed up adding new features, while making the existing JDBC code more maintainable. One tricky area has been time handling: the application uses an older SQL Server version, so it cannot take advantage of the timestamp with time zone support (the datetimeoffset type) introduced in SQL Server 2008. All the time fields are kept in UTC, and the application must take care that every time is converted to and from UTC correctly. With pure JDBC this conversion is handled explicitly, but with Spring JDBC access it is implicit.

Mobile Proxy Servers

Many mobile data services implement a forced cache on access to port 80. These caches often make the unfortunate assumption that the access comes from a web browser, and that a human being will look at the page. Vodafone completely reformats page content, while T-Mobile simply recompresses images at a lower quality. For a human user, this can be a nuisance. For an embedded application, content transformation can be far more serious.

There are several workarounds possible:

  • Use SSL. This completely avoids the problem, at the cost of extra data transfer and a longer setup time.
  • Arrange with your mobile data provider to turn off content transformation for your SIMs, or for accesses to your server. It can take a long time to find the right person to arrange this, and the process has to be repeated for every network you use in every country.
  • Add a Cache-Control header to your HTTP requests, and set a meaningful User-Agent header.
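
The third workaround might look like the following sketch for an embedded client that builds its own requests. It assumes that no-transform is the intended Cache-Control directive (the standard directive asking intermediaries not to modify the payload). The function name build_request and the header values in the usage note are made up for illustration.

```c
#include <stdio.h>

/* Format a GET request that asks proxies not to transform the response.
 * Returns the request length, or -1 if the buffer is too small. */
int build_request(char *buf, size_t size,
                  const char *host, const char *path,
                  const char *user_agent)
{
    int n = snprintf(buf, size,
                     "GET %s HTTP/1.1\r\n"
                     "Host: %s\r\n"
                     "User-Agent: %s\r\n"              /* identify the device */
                     "Cache-Control: no-transform\r\n" /* assumed directive */
                     "Connection: close\r\n"
                     "\r\n",
                     path, host, user_agent);
    if (n < 0 || (size_t)n >= size)
        return -1;
    return n;
}
```

For example, build_request(buf, sizeof buf, "example.com", "/status", "acme-meter/1.0") fills buf with a request ready to send over a socket. Whether a given operator's proxy honours the header still has to be tested on that network.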

Running Linux on a PCI Add-in Card: Hardware

Every so often I see someone attempting to run the Linux kernel on a PCI add-in card. I’ve done this myself, but there are a lot of complications. This article covers the hardware, and a second article will cover software. Don’t take this as chipset selection advice: before you commit to hardware double-check both the errata and the availability of the silicon.

SSL Handshake Overhead for Mobile Devices

If you’re designing an application where devices communicate with a server over a mobile network, there are trade-offs between implementation effort and data transfer. This may not apply to a consumer application, where the application developer doesn’t have to pay the data charges. But if the application is machine-to-machine (M2M), these trade-offs matter.

In the Year 2038

I have now seen my first ever year 2038 bug. An embedded Linux system that was installed two years ago became unable to acquire a network address by DHCP. The machine did not require an accurate clock, and nobody had initialised its battery-backed real-time clock. Once installed, it had started counting forward from 1st January 1900.

Signed 32-bit Unix time covers a range from December 13th 1901 to January 19th 2038. As the real-time clock value was outside this range, Linux wrapped the time round to the year 2036. After the machine had been running for nearly two years, it passed through the 2038 rollover and jumped back to 1901.
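
The arithmetic can be checked with a short sketch. It assumes the wrap is a plain reduction modulo 2^32; the names wrap32 and seconds_until_rollover are made up for illustration, and this is not the kernel's code.

```c
#include <stdint.h>

/* Seconds between 1900-01-01 and 1970-01-01 (the Unix epoch). */
#define SECONDS_1900_TO_1970 2208988800LL

/* Reduce a 64-bit second count to a signed 32-bit value, modulo 2^32. */
int32_t wrap32(int64_t t)
{
    uint32_t u = (uint32_t)t;            /* well-defined modulo 2^32 */
    if (u <= (uint32_t)INT32_MAX)
        return (int32_t)u;
    /* Upper half of the unsigned range maps onto negative values. */
    return (int32_t)(u - 0x80000000u) + INT32_MIN;
}

/* Seconds left before a 32-bit clock reading `now` reaches INT32_MAX. */
int64_t seconds_until_rollover(int32_t now)
{
    return (int64_t)INT32_MAX - now;
}
```

Under that assumption, wrap32(-SECONDS_1900_TO_1970) is 2085978496, which falls in February 2036, and seconds_until_rollover() of that value is 61505151 seconds, roughly 712 days. That matches a machine that ran for nearly two years before jumping back to 1901.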

This would have been harmless in itself if all the applications on the machine used a monotonic clock, such as the uptime counter returned from the sysinfo function. But the machine in question used an older version of Busybox, and the udhcpc DHCP client in that release failed when faced with a time in the negative number range before 1st January 1970.
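
As a rough illustration of timing against the uptime counter instead of the wall clock, here is a minimal Linux-only sketch built on sysinfo(). The helper names uptime_seconds and timeout_expired are hypothetical, and this is not how udhcpc itself is written.

```c
#include <sys/sysinfo.h>

/* Seconds since boot, or -1 if sysinfo() fails.
 * This value is unaffected by changes to the wall-clock time. */
long uptime_seconds(void)
{
    struct sysinfo info;
    if (sysinfo(&info) != 0)
        return -1;
    return info.uptime;
}

/* Returns 1 once `duration` seconds have passed since `start`
 * (a value previously obtained from uptime_seconds()), else 0. */
int timeout_expired(long start, long duration)
{
    long now = uptime_seconds();
    if (now < 0)
        return 0;   /* cannot tell; treat as not expired */
    return (now - start) >= duration;
}
```

A timer built this way keeps working whether the wall clock reads 1901, 2036 or anything else, because it only ever compares two uptime readings.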

The moral of the story? Even though a machine doesn’t need a real-time clock function, it may not be immune to clock-related bugs.

Vorbis on DM642

Theora video on the DM642 may not be entirely successful, but Vorbis audio is a different story. I’ve been experimenting with the Tremor integer-only implementation of Vorbis decoding.

Tremor offers two modes of operation: normal mode and low precision mode. Normal mode requires 64-bit intermediate results in arithmetic operations, whereas low precision mode only requires 32-bit intermediates. Testing both modes against the standard Linux command line Vorbis decoder, oggdec, reveals that the normal mode has an RMS error of 0.71 bits, whereas the low precision mode has an RMS error of 58 bits. (I performed the test using Lepidoptera by Epoq from vorbis.com as the sample track, decoding to 16-bit, 44.1kHz stereo.) The result for low precision mode is consistent with user complaints of audible distortion.

The good news for Vorbis on DM642 is that using 48-bit intermediate results produces output very close to the normal mode, with an RMS error of 1.0 bits. The mpylir instruction of the DM642 multiplies a 16-bit by a 32-bit quantity, and shifts the result to fit within 32 bits. This allows a decoder with quality almost indistinguishable from normal Vorbis output, but performance as fast as Tremor’s low precision mode.
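
To make the idea of a 48-bit intermediate concrete, here is a portable sketch comparing a full 64-bit fixed-point multiply with one built from a single 16-bit by 32-bit product. It is not the decoder's code and not Tremor's. The helper mul16x32_r15 models a signed 16x32 multiply that is rounded and shifted right by 15 bits; treating that as the behaviour of the DM642 instruction is an assumption, and all function names are made up for illustration.

```c
#include <stdint.h>

/* Reference: 32x32 multiply with a 64-bit intermediate, result >> 31.
 * Callers are assumed not to pass INT32_MIN for both operands. */
int32_t mult31_64(int32_t x, int32_t y)
{
    return (int32_t)(((int64_t)x * y) >> 31);
}

/* Model of a signed 16x32 multiply: the product needs at most 48 bits,
 * and is rounded and shifted right by 15 to fit in 32 bits.
 * Relies on >> of a negative value being an arithmetic shift, which
 * holds on mainstream compilers but is implementation-defined in C. */
int32_t mul16x32_r15(int16_t a, int32_t b)
{
    return (int32_t)(((int64_t)a * b + 0x4000) >> 15);
}

/* Approximate mult31_64 using only the top 16 bits of y. */
int32_t mult31_48(int32_t x, int32_t y)
{
    return mul16x32_r15((int16_t)(y >> 16), x);
}
```

For x = 1000000 and y = 123456789, the 64-bit version returns 57489 and the 48-bit version 57465. The gap comes from discarding the low 16 bits of y.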