Parsing RSS on an 8-bit Micro

One of the most common questions I am asked about my Embedded RSS Reader is how to process the XML data of an RSS feed on an 8-bit microprocessor.

In a conventional application, it is trivial to use an XML library to parse the data into a tree-like structure. In PHP5, for instance, you could simply write:


$xml = simplexml_load_file("myfeed.xml");

Voilà! A complete XML tree is now stored in $xml. On the AVR this is slightly more problematic because there is simply not enough memory to store the textual feed, let alone the corresponding tree structure.

The good news is that we don’t actually have to store anything except the required information. We process the incoming data stream one character at a time using a regular expression to match the appropriate tags and capture groups to extract the title, description and link.

Obviously, the AVR cannot use a Perl-style regular expression. That would use too much memory and require too much computation. Furthermore, some RegEx engines work backwards through the data so the entire downloaded document would need to be stored somewhere.

We must go back to the raw elements of how a RegEx works. It is, after all, a representation of a non-deterministic finite automaton (try saying that five times fast), which can always be converted into an equivalent deterministic one. Digital electronics love state machines, and computer software is no exception. The net result is a deterministic finite automaton (DFA) with a hundred-or-so states that can extract the relevant parts of an RSS document on the fly.
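
To give a flavour of the idea, here is a much-reduced sketch, not the reader's actual code. It assumes only the title is wanted, ignores CDATA sections and entities, and uses a handful of states rather than a hundred. The names TitleExtractor, feed and extractTitles are made up for this illustration, and extractTitles is a host-side helper for trying it out on a PC.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical streaming extractor: consumes one character at a time and
// captures the text between <title> and </title>. Only the current title is
// stored; the rest of the document is discarded as it streams past.
class TitleExtractor {
public:
    // Feed one character. Returns true when a complete title was just captured.
    bool feed(char c) {
        switch (state_) {
        case State::Text:
            if (c == '<') {
                state_ = State::TagOpen;
            } else if (capturing_ && len_ < kMaxTitle) {
                buf_[len_++] = c;        // keep the text; over-long titles are truncated
            }
            return false;
        case State::TagOpen:             // first character after '<'
            closing_ = (c == '/');
            pos_ = 0;
            mismatch_ = false;
            state_ = State::TagName;
            return closing_ ? false : matchName(c);
        case State::TagName:
            return matchName(c);
        case State::TagRest:             // skipping attributes until '>'
            return (c == '>') ? endTag() : false;
        }
        return false;
    }

    const char *title() const { return buf_; }

private:
    enum class State : uint8_t { Text, TagOpen, TagName, TagRest };
    static const std::size_t kMaxTitle = 63;
    static const std::size_t kNameLen = 5;   // length of "title"

    // Step through the tag name, comparing it against "title" as it arrives.
    bool matchName(char c) {
        static const char kName[] = "title";
        if (c == '>' || c == ' ' || c == '/') {
            matched_ = !mismatch_ && pos_ == kNameLen;
            if (c == '>') return endTag();
            state_ = State::TagRest;
        } else if (!mismatch_ && pos_ < kNameLen && c == kName[pos_]) {
            ++pos_;
        } else {
            mismatch_ = true;
        }
        return false;
    }

    // Called on '>': start capturing after <title>, finish on </title>.
    bool endTag() {
        state_ = State::Text;
        if (!matched_) return false;
        if (!closing_) {
            capturing_ = true;
            len_ = 0;
            return false;
        }
        if (!capturing_) return false;
        assert(len_ <= kMaxTitle);
        buf_[len_] = '\0';
        capturing_ = false;
        return true;
    }

    State state_ = State::Text;
    bool closing_ = false;
    bool mismatch_ = false;
    bool matched_ = false;
    bool capturing_ = false;
    std::size_t pos_ = 0;
    std::size_t len_ = 0;
    char buf_[kMaxTitle + 1] = {0};
};

// Host-side driver for experimenting; on the AVR the characters would
// instead arrive from the network one at a time.
std::vector<std::string> extractTitles(const std::string &xml) {
    TitleExtractor ex;
    std::vector<std::string> titles;
    for (char c : xml) {
        if (ex.feed(c)) titles.push_back(ex.title());
    }
    return titles;
}
```

The point of the design is that memory use is fixed by the size of the capture buffer, no matter how large the feed is.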


SEO: The Black Art

I am currently grappling with a client’s under-performing website. Our conversion rate is fairly good, which is understandable with the good products and range of price points. We cannot, however, actually get people on to the website in question.

The site has been examined by countless SEO companies who recommend increasing the website’s exposure, obtaining links and countless other (downright obvious) activities. Of course, they want a fairly nice price for this.

Not a single one has picked up on the core problem. Nobody is searching for the products. The local search volume on their ‘recommended’ keywords was a few hundred last year and, in one case, is declining rapidly.

Sometimes, SEO is not the problem. The products need marketing off-line to convince people they need them.


Embedded RSS: in action

I’d just like to share a video of my Part III Project in action (via YouTube).

Here the device is connecting to the BBC News Front Page feed and downloading the top articles.

More info here.


A Better Real-time Library for Arduino

For my project I required a simple timer library for the Arduino development environment, something with a simple-but-powerful interface to schedule tasks to complete in the future or at regular intervals.

There doesn’t really seem to be anything suitable online – a host of over-complicated APIs that use several classes, structs and rely heavily on other libraries. Moreover, these tend to poll as part of the main loop whereas I wanted something that will run in the background leaving the processor free to work on more important stuff.

My library (attached here) achieves this: a lightweight, interrupt-driven scheduler that can run tasks at one-second resolution.

The API is really simple:

  • Timer();
    Handles all initialisation functionality.
  • timer_h once(int interval, callback_h callback);
    Schedules a new, one-shot timer to invoke the function callback at interval seconds in the future.
  • timer_h repeat(int interval, callback_h callback);
    Schedules a new timer to invoke the function callback every interval seconds, until the timer is deactivated.
  • void enable(timer_h timer);
    Enables a timer, referenced by timer, that has previously been disabled.
  • void disable(timer_h timer);
    Disables a timer, referenced by timer, effectively pausing it, whilst leaving it active on the system.
  • void destroy(timer_h timer);
    Removes a timer from the scheduler, freeing up the space for a new timer.

Currently, up to 10 timers are supported, although the MAX_TIMER definition in Timer.h can be changed to allow more timers or to save memory. The Timer variable must be called RTOS.

Code example:

#include

// This is a nasty hack to resolve some weirdo errors where some system types
// and functions get redefined, for some unknown reason. Rather than modify the
// header file to resolve this conflict, we remove the redefinitions here.
#undef int
#undef abs
#undef double
#undef float
#undef round
// End Hack

Timer RTOS = Timer();
timer_h rep;

void setup() {
  Serial.begin(9600);

  RTOS.once(5, t_once);
  rep = RTOS.repeat(1, t_rep);
}

void t_once(void) {
  Serial.println("World!");
  RTOS.destroy(rep);
}

void t_rep(void) {
  Serial.print("\nHello ");
}

void loop() { /* nil */ }

The library has been tested on an Arduino Duemilanove, with an ATmega328.
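
For anyone curious how a scheduler with this kind of interface could work, here is a host-side sketch. It is not the code in the download. It assumes that a handle is just an index into a fixed table of MAX_TIMER slots and that something (on the AVR, a hardware timer interrupt) calls tick() once per second. SketchScheduler, tick and runDemo are names invented for this illustration.

```cpp
#include <cassert>
#include <cstdint>

typedef void (*callback_h)(void);  // assumed: a plain function pointer
typedef int8_t timer_h;            // assumed: a handle is a slot index, -1 on failure

const int MAX_TIMER = 10;

class SketchScheduler {
public:
    timer_h once(int interval, callback_h cb)   { return add(interval, cb, false); }
    timer_h repeat(int interval, callback_h cb) { return add(interval, cb, true); }
    void enable(timer_h t)  { if (valid(t)) slots_[t].enabled = true; }
    void disable(timer_h t) { if (valid(t)) slots_[t].enabled = false; }
    void destroy(timer_h t) { if (valid(t)) slots_[t].callback = nullptr; }

    // Assumed to be called once per second.
    void tick() {
        for (int i = 0; i < MAX_TIMER; ++i) {
            Slot &s = slots_[i];
            if (s.callback == nullptr || !s.enabled) continue;  // free or paused
            if (--s.remaining > 0) continue;                    // not due yet
            callback_h cb = s.callback;
            if (s.repeating) {
                s.remaining = s.interval;   // re-arm for the next period
            } else {
                s.callback = nullptr;       // one-shot: free the slot before the call
            }
            cb();
        }
    }

private:
    struct Slot {
        callback_h callback;  // nullptr marks a free slot
        int interval;
        int remaining;
        bool repeating;
        bool enabled;
    };

    static bool valid(timer_h t) { return t >= 0 && t < MAX_TIMER; }

    timer_h add(int interval, callback_h cb, bool repeating) {
        for (int i = 0; i < MAX_TIMER; ++i) {
            if (slots_[i].callback == nullptr) {
                slots_[i] = Slot{cb, interval, interval, repeating, true};
                return static_cast<timer_h>(i);
            }
        }
        return -1;  // table full
    }

    Slot slots_[MAX_TIMER] = {};
};

// Host-side demo with the same shape as the sketch above: a repeating timer
// that is destroyed by a one-shot timer. Counters stand in for Serial output.
static SketchScheduler sched;
static timer_h rep = -1;
static int hellos = 0;
static int worlds = 0;

static void t_rep(void)  { ++hellos; }
static void t_once(void) { ++worlds; sched.destroy(rep); }

int runDemo(int seconds) {
    sched.once(5, t_once);
    rep = sched.repeat(1, t_rep);
    assert(rep >= 0);                       // the table should not be full here
    for (int s = 0; s < seconds; ++s) sched.tick();
    return hellos;
}
```

On real hardware tick() would run in interrupt context, so the table would need protecting against concurrent access from the main program. That detail is left out here.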

Download: RTOS.zip


Reading RSS on the AVR

Arduino RSS Reader

Last February, I saw an RSS Reader implemented on an AVR ATmega8, which seemed impressive but was, in reality, just a serial-controlled LCD terminal. The general consensus was that it’d be cool if this were done without the requirement for the PC – this was the topic of my third-year project at university.

My Embedded RSS reader uses an Atmel AVR, on an Arduino development board, complete with Ethernet connection and a 4×20-character liquid crystal display. It connects to a network using DHCP to configure its own IP address before downloading the latest stories and displaying them.

I’ll post more about this project on its sandbox page as soon as I can.


Another Step Forward

I now have all the component parts in place for my Embedded RSS reader: Ethernet, TCP and the liquid crystal display (who knew the pin-out was backwards?). The software is almost complete too. After a morning of integration and a little testing, there should be something to post here in the form of a video.

More features to add though: configuration; DHCP (I hate the hard-coded IP address); and ‘Send to my PC (or Mac)’ to load the story in a real web browser.


Downloading RSS to the AVR

I have finally managed to fix a nasty little bug with my AVR-powered RSS reader (more details coming here shortly).

Basically, it would get half-way through downloading the RSS document, pause for a second and the AVR would restart itself. I thought it was an issue with the network connection, but trying to diagnose over the serial connection didn’t help – again it would pause and restart.

Top Tip: don’t overflow your heap! A small typo (an additional 0) in one of my data size definitions was causing the heap to occupy 1000% of the AVR’s internal memory space. It’s hardly surprising that some piece of data eventually wrote over the function’s return address on the stack.
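
One way to catch this sort of typo before it reaches the chip is to let the compiler check the size constant against a memory budget. The sketch below is only an illustration: it assumes an ATmega328 with 2048 bytes of SRAM, and the reserved allowance, the buffer size and all of the names are made up for the example rather than taken from my code.

```cpp
#include <cassert>
#include <cstddef>

// Assumed figures, for illustration only.
constexpr std::size_t kSramBytes      = 2048;  // SRAM on an ATmega328
constexpr std::size_t kReservedBytes  = 512;   // hypothetical allowance for stack and other globals
constexpr std::size_t kFeedBufferSize = 256;   // a stray extra zero (2560) would not compile

// True if a buffer of this size fits within what is left of the SRAM.
constexpr bool fitsInBudget(std::size_t bytes) {
    return bytes <= kSramBytes - kReservedBytes;
}

// The build fails here, rather than the AVR resetting at run time.
static_assert(fitsInBudget(kFeedBufferSize), "feed buffer exceeds the SRAM budget");

static char feedBuffer[kFeedBufferSize];

// Bounds-checked write: refuses to run past the end of the buffer.
bool storeByte(std::size_t index, char value) {
    if (index >= kFeedBufferSize) return false;
    feedBuffer[index] = value;
    return true;
}
```

This checks a statically sized buffer; memory handed out by malloc at run time would need its own check on the returned pointer.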


AVR Development on the Arduino

I have just started using an Arduino for my project. I hadn’t considered it as a platform before, but I am starting to love it for rapid development. It is so usable – no loose wiring to connect up programmers or power supplies – it’s all integrated into a single unit.

I very nearly have the basic functionality complete, now to see how far we can push this…


Pointers on the Atmel AVR

I have just resolved a rather weird bug in a third-party library for the Atmel AVR. Compiled under OS X, the code runs perfectly, but on the AVR it runs like a dog.

The secret is, apparently, in the heavy use of C pointers and pointer arithmetic to manage complex data structures within the AVR itself. An AVR data pointer is a 16-bit value, twice the word length of the microprocessor itself, so it needs a special register pair to hold it: specifically one of the three pairs X, Y and Z at the top end of the register file.

Now, the GCC compiler will typically use one of these (Y) as the frame pointer to manage the stack, leaving a grand total of two for the general code to use. This could well explain the problem: when more than two pointers are in play at once, a memory access can require two more reads to repopulate the pointer register pair first. Each read only costs a couple of clock cycles, but it happens far too often in the code.

Top Tip: lay off the pointers!
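
As a rough illustration of what laying off the pointers can look like, the sketch below resolves a pointer chain once instead of following it on every loop iteration. The Packet and Connection structures and both function names are invented for the example and are not from the library in question. An optimising compiler may well do this hoisting by itself, so treat it as a sketch of the idea, not a guaranteed speed-up.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical data structures, for illustration only.
struct Packet {
    uint8_t length;
    uint8_t payload[32];
};

struct Connection {
    Packet *current;
};

// Pointer-heavy version: every iteration follows conn->current again,
// so two pointers (conn and conn->current) are in play inside the loop.
uint16_t checksumIndirect(const Connection *conn) {
    uint16_t sum = 0;
    for (uint8_t i = 0; i < conn->current->length; ++i) {
        sum += conn->current->payload[i];
    }
    return sum;
}

// Hoisted version: follow the chain once, copy the length into a plain
// local, then walk a single pointer through the payload.
uint16_t checksumHoisted(const Connection *conn) {
    const Packet *packet = conn->current;
    const uint8_t *data = packet->payload;
    uint16_t sum = 0;
    for (uint8_t remaining = packet->length; remaining > 0; --remaining) {
        sum += *data++;
    }
    return sum;
}
```

Both functions return the same result; the second simply keeps fewer pointers live inside the loop.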


Debugging in OpenGL

The computer scientist Brian Wilson Kernighan once said:

Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.

This is all well and good, but trying to do some graphics programming using OpenGL, I am finding it nigh on impossible to debug at all. A black window, whilst a sign of failure, is rather difficult to diagnose.

The problem is worsened when weird and wonderful things start to happen. For some reason, rotating the camera through 2π radians causes the lighting to change, and spinning the camera on the spot causes a pleasant effect with the light fading in and out. Unfortunately, this is not the prescribed action.
