Categories: gsoc, livejournal, Uncategorized

If there was one thing you wish you had known before…

One of the questions in the student evaluation of the Google Summer of Code reads:

If there was one thing you wish you had known before getting started in Summer of Code, what would it be?

It is a very typical evaluation question and we all sort of know what it means and how to answer it. However, if you insist on thinking about it – and this is very acceptable behavior in some circles – it is actually a very difficult question.

I tend to read this question as follows: if you could meet yourself in the past, what would you tell your past self?

Well, I would probably give myself the final git repository, plus an external hard disk with as much of the interesting new information on the present day Internet as possible.

What would my past self do with that information? He would probably decide to use the outcome of a couple of footy matches to make a decent living. But apart from that, he would pick another project. Not because my project is uninteresting, but because the part I enjoyed most was building it in the first place. Continuing to work on this particular code base is still interesting, but not as interesting as building it was.

So my conclusion is: the more advice I give to my past self, the less interesting his project would become. This is not a real problem, however, because the information would provide him with many new opportunities.

I have the same view on Sudoku puzzles. A friend of mine wrote a computer program, while he was drinking beer in the pub, that could solve a lot of these puzzles. Many fanatical puzzlers would never consider using such a program; it would take away the fun.

I completely disagree with them. Now that my friend has relieved the world of The Sudoku Problem, mankind can move on to solving new problems.

I do not understand why people take such pleasure in creating artificial problems and then solving them over and over again, when there is an astonishing abundance of problems already out there waiting to be solved.

Just to make an even bolder statement: anyone who spends even a minute a day solving problems that have already been solved should feel really guilty about climate change, poverty, diseases, slow public transport and millions and millions of other problems. Well, at least I tend to look at my own behavior from that perspective. All that without losing the ability to enjoy life; that is the really tricky part.

Categories: Australië, gsoc, livejournal

The end of the Summer – let there be Summer!

Although there are still a couple of days left until the official Pencils Down date of the Summer of Code, I am now officially putting my pencil down because I need to catch a train to Adelaide tomorrow morning.

I guess this really marks the end of my student period; even though I graduated in June, this project allowed me to feel like a student just a little longer. Sniff, now I really have to enter the big scary adult world.

But first I will go on a trip for two weeks to see Adelaide, the Ghan train, Darwin and Kakadu National Park. It will be a very culturally diverse trip; from what I have heard, Adelaide and Darwin are pretty much as different as it gets here in Australia.

Route profile demo by Lambertus

I am very happy to see that my application has found its way to an actual route planner website (see figure above)!

So what is next? Well, I will obviously have much less time to work on this project, so my highest priority will be to explain to other people how to use and improve the application and how to install it on their own server. So don’t hesitate to mail me about that!

I have to keep this post short because I still have to pack some stuff and it is already late. But I do want to thank some people of course. Thanks Google for sponsoring me (and for creating all sorts of cool and useful tools for my project). Thanks OpenStreetMap community for selecting my project, for your confidence in me and for your support. And of course, thanks Artem for mentoring me during the project and for being a great and interesting person to talk to in general!

These are just thank-yous, not goodbyes. So see you soon!

Categories: gsoc

Weekly update route altitude profile

It’s a bit of a boring title, but it actually has been an interesting week.  Although I found myself highly distracted by some unrelated but fascinating things, I still managed to get quite a bit done.

The script that I used to download the SRTM data set and import it into a Postgres database can now deal with all continents and supports uploading a subset of a continent by means of a bounding box. I also put the MD5 checksum of every tile in the source code.
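The checksum test itself is trivial; here is a minimal sketch (the function name and the tile path are placeholders, not taken from the actual script):

import hashlib

def tile_is_intact(path, expected_md5):
    # compare a downloaded tile against the checksum stored in the source code
    data = open(path, "rb").read()
    return hashlib.md5(data).hexdigest() == expected_md5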

Since the App Engine still has some issues, I have revived the Postgres version of my application. It is located at http://altitude-pg.sprovoost.nl and contains most of Europe, as far east as Moscow and as far south as Cyprus. It runs on my home computer in The Netherlands, so please be nice to it. I use Apache with mod_python for the formal demonstration website and mod_wsgi combined with web.py for the altitude profile server. To make this as painless as possible, I have moved all App Engine and Postgres-specific code to their own files and kept as much common functionality as possible in the main file. I can now run the development servers for both Apache and the App Engine from the same source code folder, at the same time.
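The idea behind that split, in a minimal sketch (the module names here are made up, not the actual file names in the repository):

# pick the backend once at import time; everything below this point is shared code
try:
    import storage_gae as storage      # hypothetical App Engine data store backend
except ImportError:
    import storage_pg as storage       # hypothetical Postgres backend

def altitude_at(lat, lon):
    return storage.get_altitude(lat, lon)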

I have requested more storage space on the App Engine and I am also considering a more efficient storage method. Instead of storing one altitude per record, I could store 100 altitudes per record and zip them. That would drastically reduce the total storage requirement, but at the cost of performance, because I often need only about 2 out of these 100 altitudes.
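Roughly what I have in mind, as a sketch (assuming 16-bit altitudes, zlib and chunks of 100; none of this is in the code yet):

import struct, zlib

altitudes = [200 + i for i in range(100)]            # 100 consecutive altitude samples (metres)
packed = struct.pack(">100h", *altitudes)            # 200 bytes of raw big-endian smallints
blob = zlib.compress(packed)                         # one compressed blob stored per record

# reading back a single altitude (say the 18th) from such a record:
raw = zlib.decompress(blob)
altitude_17 = struct.unpack(">h", raw[2 * 17:2 * 17 + 2])[0]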

I have also been a bit more active on their mailing list; it feels good to be able to answer people’s questions and at the same time it allows me to verify my own code and design. There are also some interesting, albeit more philosophical, discussions on the list.

I have signed and fulfilled a pledge to “spend an hour OpenStreetMapping features on Caribbean islands from Yahoo! aerial imagery and […] donate £10 to the OpenStreetMap Foundation but only if 60 other people will do the same.”. I felt like I could really use another jet-lag. The pledge is full, but who knows, if they can rally another 60 people there might be a second ticket?

Those of you who laboriously follow every commit to the OpenStreetMap subversion repository may have noticed that I am still struggling with git-svn. I got really tired of fixing conflicts, so I unleashed the power of git-svn set-tree:

git-svn set-tree -i trunk 3cb585dca1d7fe10791312ca26125168506b61c1
git-svn set-tree -i trunk 07c9024f5ea4ce60f481b8089b61d4988e7588fa

Even the manual recommends against doing this, and you should make sure nobody else (like your mentor) has committed anything to subversion before you do this.

I find git-svn to be harder to use than it should be. I think it is trying too hard to properly translate between The Git Way and The Subversion Way. I just want the subversion repository to ‘sort of’ track my git repository. I don’t care if it has to represent the history a bit differently. Just keep the code up to date. I am looking forward to this command:

git-svn just-do-it

I really think Git would benefit the OpenStreetMap community, because it reflects the decentralized nature of OpenStreetMap. With Git, there is no such thing as a central repository. People can write any code they like without having to live in constant fear of breaking the trunk with their next commit. Instead, when they build something cool or useful, they will tell their friends to pull it in and experiment with it. The person who operates a production website will only pull pieces of code that he or she considers safe and useful enough.

But the reality is that many organizations rely on subversion at the moment and have excellent reasons for not risking their operations by making an instant jump to Git. So people are not going to adopt Git very quickly as long as it is so hard to sync with subversion. But let’s wait for a while and see…

I am getting better and better at keeping my git repository synchronized with the osm subversion, but I would not recommend this strategy to others.

I created a Google Code Hosting project for the altitude profile. Not to host the code, not even for the wiki, but just to keep a list of issues. I realize I could have applied for a place on the OpenStreetMap Trac, but I want to use Google Code Hosting for my new project: Jobtorrent. This is also the reason most of the issues point to the Git source (I do point to subversion on the main page and the only reason I do not always point to both is that I am lazy). I will write more about Jobtorrent later; first I need to work on my Summer of Code project, you know…

This list of issues should be good for continuity. Because my project does not interact with any OpenStreetMap code at the moment, I am probably the only one in the community who knows how the code works and what needs to be improved. That is a very low bus factor! (“tram factor” would be a better term in Melbourne) Now I really like the OpenStreetMap effort and I will certainly find ways to stay involved in the future, but it might be in a completely different project. Depending on circumstances, I should at least prepare for the possibility that the altitude profile project will be orphaned within a few months.

I use a personal organizing method inspired by the book Getting Things Done (David Allen) and that makes it very easy to transfer everything I am working on or thinking about to the Internet. So that is what I did.

The more difficult part is keeping it synchronized. David recommends that you never share your projects. That is, you should always keep your own lists and let nobody else touch them. Your lists must reflect what you want, or you will start to rebel against them and as a result mess up your system.

So in practice you will end up with a central list (e.g. the list of issues on Google Code) and your local copy of it. They will not be the same. There are a couple of things on my personal list that are not online (nothing ground-breaking, don’t worry) and my own priorities are not identical to the ones online. The online version reflects what is important for The Project, the offline version reflects what is important for me. At least in theory; as long as I am the only one working on it, it probably reflects my opinion a lot better than it ideally should.

Now I am pretty sure the average recruiter looking for a “true team player” does not like what I just said in the last paragraph.

Categories: gsoc

Is Google evil? And why the world is happy I’m not a CEO.

I have recently started following Umair Haque’s blog.

Umair Haque is Director of the Havas Media Lab, a new kind of strategic advisor that helps investors, entrepreneurs, and firms experiment with, craft, and drive radical management, business model, and strategic innovation.

He’s written a manifesto for the next industrial revolution and vowed to provide free consulting to five web 2.0 start-ups that actually do something useful for a change (i.e. change the world).

He started an open discussion around the question whether Google is evil or not. I find this a fascinating question. Many people find great comfort in just answering the question with “yes” and pointing to some good examples. However I’ve met a couple of Googlers here and there, read some blogs and read the book. That convinced me that there are at least some people within the company who firmly believe in the Do No Evil motto and work very hard towards that goal.

Also, saying that Google as a whole is either evil or good implies a conspiracy. It implies that a select group of people with evil and greedy plans (I guess most people would point to the shareholders and managers, because they wear suits) is in complete control of the company. Not only that, they deliberately employ naive people and let these naive people blog and talk about doing no evil, so that they can continue their evil plans unnoticed.

This conspiracy theory implies that Google is a well-oiled machine, a super organized corporation. My outside observations convince me that that is far from the case. I would not go as far as to say they are an organizational mess, but they are definitely not organized enough to take over the world in a massive global conspiracy.

That argument also means they can’t be all Good. They are definitely doing things that I do not feel comfortable with, even things that I would probably still disagree with if I knew all the inside info. Just use their website to find examples.

So for those people who love simple answers: just pick the answer you like and stick to it, if it makes you feel better. There are always more important things in life than answering this question.

This is my reply to Umair’s question:

Can you think of instances where Google has violated this [do no evil] principle?

Probably. In some cases it is hard to decide because they won’t or can’t release all the internal discussion that went into certain decisions. We just have to take their word for it…

Is Google becoming more evil as it attains more market power?

… which becomes a very unattractive option given their current size. No matter how good their intentions, people (and especially institutions that are supposed to protect us from companies doing evil stuff) are just not going to take their word for it. You already mentioned this in a previous post. Update: no, that was an article in the International Herald Tribune titled “Google The New Master of Network Effects”, second-to-last paragraph.

Is the relationship between market power and evil set in stone – will Google inevitably become evil, because that’s what happens to companies (and people) as they grow up?

Nothing is set in stone. Google is just a decade old and probably has a hard time dealing with its own size and growth. They can grow really bad or really good. I also doubt that even Larry and Sergey have enough control over their creation to steer it in any direction, although I hope I’m wrong.

In my opinion the “best” (as opposed to “most evil”) thing Google could do is create complete openness. Ideally, they should open source *everything*, including their secret sauces, and also have no more secret projects. I think the best way to keep governments happy, make people feel more comfortable and help the world move forward is to make it as easy as possible for a strong competitor to emerge. Ideally a new market player would just have to buy half a million computers, a couple of engineers and a marketing team and be able to run a copy of Google, AdSense included.

Now this sounds a lot like economic suicide to me, but if they keep growing like this, then, if I were Neelie Kroes, I would enforce this at some point in the future anyway. So Google could, like Microsoft, wait for that moment or do it themselves right now.

I am sure Larry, Sergey and a whole lot of people working at Google would love this idea, but the shareholders probably won’t like it and it might not even be legal. Unless someone can find a way to do that and make a profit out of it.

Or we can enter the realm of my even more far-fetched ideas: just start abusing your power and let the government help you bypass the shareholders. I guess most people are happy that I’m not a CEO 🙂

Categories: gsoc, website

RESTful

Update 17-7-2008 : “Formal” demo server now runs on apache2-mod-python

API

Not much has changed on the outside, but a lot has changed on the inside. Most importantly you can now get the altitude profile through a RESTful API. My mentor Artem Dudarev built a nice Google Maps demo that uses this API. A more formal demonstration can be found here (source code).

There is currently one way to send the route to my server, but there will be three ways. There are currently two types of output available for the profile, but there will be four. The wiki explains everything in detail, but here is a summary. It uses a short drive through Heidelberg as an example. Have a quick look in Google Maps.

Let’s start with the easiest input and output (which is not working yet by the way).

http://altitude.sprovoost.nl/profile/gchart?lats=49.407810,49.407770,49.408950,49.407040,49.406880,49.407620,49.413360,49.414800,49.414730&lons=8.681080,8.684210,8.692368,8.692670,8.693919,8.694270,8.692300,8.692110,8.693110

This simply sends a list of latitudes and longitudes via an HTTP/GET request and returns an image. You could use that in an <img> tag.
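Building that URL from a list of points in Python could look like this (a sketch; the variable names are mine):

import urllib2

points = [{'lat': 49.407810, 'lon': 8.681080}, {'lat': 49.407770, 'lon': 8.684210}]
lats = ",".join("%f" % p['lat'] for p in points)
lons = ",".join("%f" % p['lon'] for p in points)
url = "http://altitude.sprovoost.nl/profile/gchart?lats=%s&lons=%s" % (lats, lons)
image = urllib2.urlopen(url).read()                  # the response body is the chart image itself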

A more standards compliant way is to send some XML, an OpenLS RouteGeometry object, through an HTTP POST request to: http://altitude.sprovoost.nl/profile/gchart_url/xml/ . Check out the wiki to see what that XML should look like. To be fair, OpenLS goes a little over my head, so don’t expect a very proper implementation at this point; consider it a gesture.
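In Python, the POST variant could look something like this (a sketch with urllib2; route.xml is a placeholder for the RouteGeometry document described on the wiki):

import urllib2

xml_body = open("route.xml").read()                  # an OpenLS RouteGeometry document
request = urllib2.Request("http://altitude.sprovoost.nl/profile/gchart_url/xml/",
                          data=xml_body,
                          headers={"Content-Type": "text/xml"})
chart_url = urllib2.urlopen(request).read()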

A more hip way, and probably also the most data-efficient and easiest way to program, is to use the recently released Protocol Buffers from Google. Encoding in Python is as simple as this:

route_pb = altitudeprofile_pb2.Route()

for p in route:
    point = route_pb.point.add()
    point.lat = p['lat']
    point.lon = p['lon']

route_pb_string = route_pb.SerializeToString()

Here route is an array of dictionaries like {‘lat’ : 49.407810, ‘lon’ : 8.681080}. More details in the wiki.
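For completeness, decoding on the receiving end mirrors the example above (a sketch, not the actual server code):

route_pb = altitudeprofile_pb2.Route()
route_pb.ParseFromString(route_pb_string)
points = [{'lat': p.lat, 'lon': p.lon} for p in route_pb.point]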

The protocol buffer string is only 108 bytes in this case, whereas the xml document was 793 bytes and the GET string uses 180 bytes.

The examples here return a Google Chart image. The server will fetch this image and then send it to the client. This is easy to use, but it uses more resources on my end than strictly necessary. Therefore it is also possible to fetch the URL of the image instead of the image itself. Instead of:

http://altitude.sprovoost.nl/profile/gchart?lats=...&lons=....

you use:

http://altitude.sprovoost.nl/profile/gchart_url?lats=....&lons=....

It is also possible to retrieve an XML document:

http://altitude.sprovoost.nl/profile/xml?lats=....&lons=....

And finally I will also support protocol buffers as an output format.

Any combination of input and output will be possible.

App Engine continued

The current version of altitude.sprovoost.nl runs on the Google App Engine. There are, however, a number of issues with the App Engine that prevent full-scale, planet-wide deployment at this stage.

The first problem is limited storage space (500 MB) during the test phase.

The second problem lies in the way the data is stored, which currently takes about 100 bytes per record. Since I store each altitude as one record, that adds up. It is currently unclear whether these 100 bytes include the key or not. Eventually Google will not count these keys towards data usage, because they prefer us to optimize for performance instead of cost. In theory I only require 2 bytes (a smallint) per record, so the whole planet would only require about 40 GB. There is a discussion on the mailing list about this.

The third problem is uploading data. I already explained in my previous post that uploading the planet with the current tools would take 16 years. I opened a ticket for it.

The fourth problem is that the app engine can’t deal with Google’s own shiny new Protocol Buffers yet. I mentioned it on the mailing list and I’m sure that will be fixed soon.

In the meantime, I am working on getting my PostGIS version back up and running. It uses mostly the same code. I am also considering supporting MySQL, since I am not using any GIS functionality anyway.

The competition / colleagues

I am not the only one on this planet who is working on altitude profiles.

The most impressive one is probably Hey What’s That. If you ever enjoyed the view from a mountain or high-rise building and wondered what you were looking at in the distance, you’ll love this. They also provide an altitude profile with a HTTP/GET API. They even take the roundness of the earth into account and they are contemplating refraction.

That reminds me of the Wikipedia article about the horizon: I drew the diagram on that page years ago because I needed it for a paper on Islamic prayer times and refraction; as I understood it, you are not supposed to pray when the sun is exactly on the horizon, because pagans do that. Refraction makes that issue a bit more complicated and I spent some time with a fellow student trying to figure out how Islamic scientists looked at that issue in the past. Contact Jan Hogendijk if you find this fascinating. The figure is still there, albeit with some improvements.

The Google Maps Mania blog lists even more examples of altitude profiles.

This makes it ever more important for me to focus. My focus is:

  • open source (obviously since it’s a Google Summer of Code project)
  • short distances (so I can pretend the world is ‘flat’ and hopefully don’t have to worry about sudden spikes in elevation between sampled points)
  • different methods to access the service
  • KISS (Keep It Simple Stupid)
  • prioritize quality of data over quality of graphics

Git

Git-svn has some issues with branches. I found a solution here (also read the follow-ups). This is what I had to do to synchronize my master branch with subversion after I merged another branch into it:

Start with the svn head. I use a trunk/tags/branches structure in subversion. If you don’t, you may have to replace “trunk” with “git-svn”.

git checkout -b tempbranch trunk
# bring in all the changes from your branch
git merge --squash master
# commit with whatever message you want
git commit -m "Whatever, this message will self destruct anyway"
# and ship it to svn land
git svn dcommit
# Go back to your master branch
git checkout master
# Clean up
git branch -D tempbranch

I usually update svn less frequently than git. But if you want to use a more recent version of my code and can’t use git, just drop me an email.

Categories: gsoc

Google Map Maker and OpenStreetMap – My five cents

Update 1-7 : New evidence blows my best-case scenario out of the water, but no worries (see below)
Update 1-7, an hour later : Or perhaps it does not, just keep the popcorn close

For those of you who have been sleeping for the past couple of hours, Google just released a map making application. Now everyone can add streets to Google maps. It is currently only available for a couple of countries where Google has very little map data, but I’m sure they will scale it up in the future. Many people, including me, will probably wonder what that will mean for the OpenStreetMap project.

There is a blog post on BlinkGeo about this:

Will Google play nice and make the crowd-sourced data available for use in applications other than Google Maps and in some common format (hint hint, KML)? Who ultimately should own the data?

The OpenGeoData blog is not very positive about the development. Steve writes:

The fundamental reasons for OpenStreetMap remain intact and if anything are now stronger. At first glance it sounds like OpenStreetMap, until you realise that Google own that data you give them, there’s no community and you are unlikely to see use of the data in ‘creative, productive, or unexpected ways’.

I am personally a bit more positive about the situation. My guess is that because Google set this up in a hurry to help out with the Myanmar disaster relief, they did not have time to think about copyright issues or to communicate with the OpenStreetMap community on how to prevent duplicate effort. They probably had their legal department come up with a standard data licence in a hurry; as Steve noticed:

The terms of use are hilarious – the bottom line seems to be that Google ends up owning exclusively the entire aggregate work, but it is your fault if anything goes wrong.

As a side note on duplicate effort, I’ve heard several people claim that the time it took to create Wikipedia is the time people in the US spend watching ads every weekend. In other words, duplicate effort in massive online collaboration projects is really not an issue yet. Even if there were a dozen projects like OpenStreetMap and Google Map Maker, each of them could still create a complete map of the world in a few days if enough people joined the effort.

Personally, I love it when other people take care of work that I was supposed to do. That gives me time to do other things, and planning is the most interesting part of most projects anyway. In my opinion, Google does a great job here for the OpenStreetMap community, with just a few as-long-as-they’s.

I like the intuitive interface and the moderation system. I managed to trace a couple of streets in the Netherlands Antilles without reading a manual. They have built-in support for moderation. You can select a neighborhood that you know and get a notification when someone edits it; my edits got feedback and were rejected (for good reasons) within the hour. It distinguishes between new and experienced users and even automatically detects when you make a ‘suspicious’ edit.

I would love to see cooperation between OpenStreetMap and Google and I think that would make a lot of sense. Google can provide the OpenStreetMap community with the massive scalable infrastructure they need and OpenStreetMap can provide Google with a community, map making experience and great tools.

Here’s a possible future scenario:

  1. Google opens up the user contributed data using an API and a weekly database dump. It should provide read and write access to the building blocks of the street data; e.g. the nodes and ways.
  2. Google switches to a Creative Commons license or something similar. As I mentioned above, I think and hope their current data policy was made in a hurry.
  3. Google imports all existing OpenStreetMap data into their system. Only the parts that are easy at first; roads and railways.
  4. Google incorporates the Map Maker API into their App Engine. Google mentioned during the Developer Day in Sydney that they strive to make that App Engine comply with a future standard to prevent a platform lock in.
  5. People start building their own editing software or porting existing ones like Potlatch and JOSM. They can use the Google App Engine or something similar, so they do not have to worry about scalability.
  6. People start building their own map viewing software or porting existing ones like Osmarender.

So in short:

  • Worst case: Google does not collaborate and we’ll have duplicate effort, which is not a problem because there is an astounding amount of disposable manpower available on the planet.
  • Best case: Google does collaborate and the OpenStreetMap project will be finished in a year or so.

Update 1-7 : Google Signs Five Year Map Agreement with Tele Atlas

That definitely blows my ‘done in a hurry’ argument out of the water, but they haven’t taken over the world yet. We’re back at what I call the worst case scenario, in which Google does not seem to be planning to share their users’ work.

As I wrote, that should not be a problem for OpenStreetMap, because there is an astounding amount of disposable manpower available on the planet.

Apart from that, Google is completely dependent on income from advertisements: Google cannot offer any service on a massive scale that does not generate ad revenue. This provides several opportunities to the OpenStreetMap community:

  1. it can give anyone access to all data, in any format
  2. it can distribute hosting and serving of this data (Tiles@home is a good example)
  3. it can collaborate with as many partners as it wants (I don’t think Google can easily expand its Tele Atlas deal to other map providers.)

The rise of Google Map Maker is probably a good thing for OpenStreetMap, because it provides something real to compete against. That makes it easier to identify weak spots and increases motivation to work on them.

Update 1-7, an hour later:

Deal does not apply to Google Map Maker contributions.

You may also be interested in Ed Parsons’ post about this subject. He does not represent the official Google point of view, but he does provide a very useful insight nonetheless.

Categories: gsoc

Google App Engine – On uploading serious bulk data

“The Google App Engine enables you to build web applications on the same scalable systems that power Google applications”, says Google. Sounds like that could come in handy for my route altitude profile. The NASA SRTM dataset has 1.5 billion data points just for Australia and I have no idea how many people will use my application.

In the last couple of days, I’ve worked through the tutorial and some of the documentation and have taken the first steps (on a separate git branch) in making my application run on the App Engine.

The first challenge was – and still is – to insert the data into the data store. The data consists of 1060 tiles of 1 degree longitude by 1 degree latitude. Each tile consists of 1200 x 1200 data points representing the altitude at each location; about every 100 meters. I wanted to insert just 1 tile, which took about a minute with Postgres when using the COPY command.
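For those curious what a raw tile looks like: it is just a square grid of big-endian signed 16-bit integers, so reading one in Python is short (a sketch; the file name is only an example):

import struct

def read_tile(path):
    data = open(path, "rb").read()
    return struct.unpack(">%dh" % (len(data) // 2), data)

samples = read_tile("S35E138.hgt")     # one tile; the values are altitudes in metres, row by row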

Google provides a bulk uploader that can be used to upload data. However, I think the author of that script had a different idea of ‘bulk’ than I do.

It is possible to test your application offline, before uploading it; the SDK comes with an offline data store. Of course, the offline data store is not the real thing. After 10 hours, it had only inserted about 10% of 1 tile. I also noticed that after each insert, the next insert would take longer.

So I skipped ahead and tried the real thing online. The good news is that the slowdown effect had disappeared, as I expected. The bad news is that I could only insert about 100 points at a time. Any more and the server would kick me. This means I would need to send an HTTP POST request 15,000 times to upload 1 tile. That would take roughly 3 days with my current connection (remember: 1 minute with Postgres). In addition, even though I used small chunks, the connection got broken by the server after a while. That means starting over again, because the bulk upload script does not have a resume function.

Clearly we need a different approach here.

One way would be to improve the bulk upload script. It should be able to resume a failed upload. But that would not speed things up.

Therefore, I think the best solution would be if Google makes it possible to upload a CSV file, so their servers can perform their equivalent of a SQL COPY command.

The next challenge will probably be data consumption. In Postgres, I store the coordinates in a ‘special way’ as one long int (8 bytes) and the altitude is stored as a small integer (2 bytes). So in theory that requires 10 bytes per data point. In practice we need to index the coordinate and there is probably some other overhead. I estimate (it’s a bit difficult in Postgres) that Postgres uses anywhere between 40 and 87 bytes per data point. It’s even harder to estimate with the Google App Engine data store, but it looks like that uses about 100 bytes per data point. I have no idea if that is good or bad. But at that rate, I would need about 150 GB of storage space and the test version only allows 500 MB. Imagine if I wanted to upload the whole world.
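To give an idea of that ‘special way’: the trick is to fold the grid position into a single integer, so one indexed bigint column identifies the point. This sketch illustrates the principle, not necessarily the exact encoding I use:

def position_key(lat, lon):
    # SRTM3 has roughly 1200 samples per degree; snap the coordinate onto that grid
    row = int(round((lat + 90.0) * 1200))
    col = int(round((lon + 180.0) * 1200))
    # one row is 360 * 1200 columns wide, so the key fits easily in a signed 8-byte integer
    return row * 360 * 1200 + col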

I have to run to the airport now (or actually to the bus to the airport); tomorrow, during the Google Developer Day in Sydney, I will hopefully have the opportunity to discuss these issues with experts. I’ll keep you posted: to be continued.

Categories: gsoc

Demonstration

Example altitude profile

I built a simple website that displays the altitude profile for four different example routes. But you can already get the altitude profile for any route in Australia through an XML-RPC request to:

http://bak.sprovoost.nl:8000/

And then call one of the following two functions:

altitude_profile(route) : returns an xml document

altitude_profile_gchart(route) : returns a Google Chart like the one on the left.

The route argument has to look like this:

<route>
  <point id="1" lat="61.8083953857422" lon="10.8497076034546" />
  <point id="2" lat="61.9000000000000" lon="10.8600000000000" />
  <point id="3" lat="61.9000000000000" lon="10.8800000000000" />
</route>
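From Python, calling the service could look roughly like this (a sketch, assuming the route is passed as the XML string shown above):

import xmlrpclib

route = """<route>
  <point id="1" lat="61.8083953857422" lon="10.8497076034546" />
  <point id="2" lat="61.9000000000000" lon="10.8600000000000" />
  <point id="3" lat="61.9000000000000" lon="10.8800000000000" />
</route>"""

server = xmlrpclib.ServerProxy("http://bak.sprovoost.nl:8000/")
profile_xml = server.altitude_profile(route)          # an XML document with altitudes
chart = server.altitude_profile_gchart(route)         # a Google Chart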

Please be nice to my home computer and if you find any security issues, please tell me.

So what shall I focus on next?

There’s the actual profile:

  • interpolation : when two points in a route are far apart, add extra points between them
  • better altitude estimation : currently I get the altitude from the nearest SRTM data point; I will combine information from multiple points
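As an illustration of the ‘better altitude estimation’ point, bilinear interpolation between the four surrounding SRTM samples could look like this (a sketch; sample(row, col) is a hypothetical grid lookup, not a function from my code):

def interpolated_altitude(lat, lon, sample):
    # positions are expressed in grid steps of 1/1200 degree (SRTM3)
    x = (lon + 180.0) * 1200
    y = (lat + 90.0) * 1200
    x0, y0 = int(x), int(y)
    dx, dy = x - x0, y - y0
    top = sample(y0, x0) * (1 - dx) + sample(y0, x0 + 1) * dx
    bottom = sample(y0 + 1, x0) * (1 - dx) + sample(y0 + 1, x0 + 1) * dx
    return top * (1 - dy) + bottom * dy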

I hope that with these changes the profiles will look a bit smoother.

There’s performance:

  • generate lots of example routes; I created these four routes manually. I would probably need to find a way to automatically extract a whole bunch of routes from the Google route planner or something similar.
  • stress test with these routes; I want to get some idea of what kind of hardware & bandwidth requirements are involved.
  • I would really like to get my hands on the Google App Engine as that would outsource part of the work to Google. Two problems there: I don’t have an API key yet and I would need more storage space than they currently allow; it would be great if I could store the whole planet, but Australia already needs 123 GB.

Security:

  • Don’t know a lot about that, except that it is very important.

Looks:

  • Something a little more pretty than the Google Chart?
  • Add some ‘landmarks’, e.g. place names, road type.

Code:

  • I’m new to Python, so I want to go over my code and make things a bit nicer: refactoring.

Last but not least, how can this tool best be integrated with other websites? One scenario that I have in mind would be a third party website that uses both the OpenRouteService and my altitude profile application. A user would enter origin and destination. Then the website sends this origin and destination to OpenRouteService (through xml_rpc?), which sends the route back, both as a map figure and as an xml document. Next the website sends this route to the Altitude Profile service (through xml_rpc), which then returns the Google Chart (or an XML document). The website then displays the map and Google Chart for the user to enjoy.

There’s many ways to go from here and although I have plenty of time left for the project, I probably can’t do everything. So what would you like to see next?

Update: More information about my project can be found here.

Categories: Australië, gsoc, livejournal

Internet Dependency and the 15-Minute People

This is the second time that I am living in another country for a while and once again I have stumbled into a huge problem that I believe is massively underappreciated by many. I need the Internet, but it’s not as ubiquitous as you’d think.

Has anyone seen the South Park episode about this? The entire United States flees to refugee camps on the west coast where there is still a bit of Internet. People wait in a queue all day for only 40 seconds of Internet. Does that sound a bit ridiculous? Well, if you are a homeless person without a laptop (and, except perhaps in Japan, most homeless people do not have one) and want to use the Internet, you’ll have to queue up at the State Library in Melbourne for 15 minutes of Internet (it takes about 1 minute to load Gmail, so do the math…). And that is the best deal in town as far as I know.

When I see these people (probably not all homeless) I get the same emotions that most people probably get when they see hungry people in Africa. Of course, that is not fair or rational, but it is just easier for me to empathize with. I completely depend on the Internet for almost everything I do and it allows me to live a fascinating life. But those 15-Minute People, as I call them, have to do everything the old-fashioned way: more time-consuming, more expensive, fewer opportunities. For example, they have to make a phone call to book a flight; that adds at least 15 dollars, plus you can’t easily compare fares. They have to find a place to live through newspaper ads; most room shares are advertised online, so they miss those. And they don’t have access to all sorts of job search websites.

I can probably find the best price and book a flight in under 15 minutes, but most people can’t; there are many important tasks for which people need to sit down and focus a lot longer. 15-Minute People have a serious disadvantage here. How about adding two hours of Internet per day, for everyone, to the list of government responsibilities? Or shall we just wait and see what happens if we don’t?

Anyway, I’ve been one of those 15-Minute People a couple of times during my travels. I take the Internet for granted, but I have been without it quite a few times, at great cost to productivity. It is even worse if the situation is unpredictable; if you do not know for how long you will be connected, or how long it will take before you are connected again. This makes it impossible to plan anything.

I spent six months in Slovakia and I would have some weeks with and some weeks without the Internet. At the office! And with no way of predicting what would happen next. Not very practical if your work is Internet based. But I guess people kind of expect that sort of thing in Slovakia (although I think it does not have to be that way).

The biggest surprise in this respect is Australia. I was completely taken by surprise by the terrible state of the Internet over here. I was expecting more or less 99% household penetration of Internet, free wifi at hostels, the usual, but I found the opposite. Hostels will gladly charge you 4 dollars an hour and it won’t be wireless either. Many homes, even with young people, do not have an Internet connection. And in general the Internet is very slow, has download limits(!) and is expensive.

Just a comparison: at home in The Netherlands I pay 30 dollars a month for my Internet connection, plus about 15 for the phone line that you need to have with it. For that money, I can download at 20 Mbit and there is no limit to how much I download. The best deal I could find here is 50 dollars for the Internet plus 20 for the phone. That gets me 0.5 Mbit download and a 25 GB limit per month. So that is 40 times slower for almost twice the price and you get a download limit as a bonus.

So what is causing this? Well, let’s just say Telstra (the former state company) blames crushing government regulations and most others blame Telstra for acting like a monopolist. I leave the choice to you.

The good news is that people are not happy with that and there is a lot of work going on to upgrade Australia’s physical, and equally important, organizational Internet infrastructure. The other good news is that Australia has a pretty cool system that you can use to quickly switch between providers and that the business of connecting and disconnecting people seems to be a *lot* faster than at home (it took only 3 hours to disconnect from our old provider; these things take weeks in The Netherlands, months in Slovakia).

I hope I will have a working 24/7 Internet connection at home by the time I get back from the Google Developer Day, my Sydney excursion and the Burning Couch festival next week. That would be great for productivity and peace of mind!

Categories: gsoc

Import NASA SRTM3 data into Postgres

The first official week of my ‘summer’ of code was a success. I managed to import the NASA SRTM3 data into Postgres. That is, the import is running at about 1 tile per minute while I am writing this. The result is available through Subversion and Git.

I think I am getting the hang of Test Driven Development. It would be great if someone could point me to a Postgres Python test tool. For now, I created a test database and wrote my own functions to populate it with test data for each test and clean up afterwards. In a month I will attend the Continuous Integration and Testing Conference in Melbourne (an Open Space event).

I mostly use PyGreSQL to connect to Postgres, but for large inserts I wanted to use the efficient Postgres COPY function. Psycopg2 supports this with its copy_from(file, table) function. However, this function mysteriously freezes, so I had to work around that.
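For reference, the COPY route with psycopg2 looks roughly like this (a sketch; the table and column names are illustrative, and as said, copy_from froze on me, so your mileage may vary):

import psycopg2
from StringIO import StringIO

rows = [(1, 200), (2, 210), (3, 215)]                      # (position key, altitude) pairs
buf = StringIO("".join("%d\t%d\n" % (pos, alt) for pos, alt in rows))

conn = psycopg2.connect("dbname=altitude")                 # hypothetical database name
cur = conn.cursor()
cur.copy_from(buf, "altitude", columns=("pos", "alt"))     # COPY the whole buffer in one go
conn.commit()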

Next week I am going to figure out how my application wants to receive a route. My original plan was that my application receives an xml request, but I just read the announcement of OpenRouteService so I will have a look at how they do it.

Tonight I am flying to Melbourne so I have some time to read The Summer of Code Mystery Book.