Categories
gsoc

Weekly update route altitude profile

It’s a bit of a boring title, but it actually has been an interesting week.  Although I found myself highly distracted by some unrelated but fascinating things, I still managed to get quite a bit done.

The script that I used to download the SRTM data set and import it into a Postgres database can now deal with all continents and supports uploading a subset of a continent by means of a bounding box. I also put the MD5 checksum of every tile in the source code.
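The actual import script is not reproduced here, so the following is only a minimal sketch of how a bounding box could be mapped to tiles and how a stored checksum could be verified. It assumes the usual SRTM convention of naming each 1x1 degree tile after its south-west corner (for example N49E008); the function names are hypothetical.

```python
import hashlib
import math


def tile_name(lat, lon):
    """Name of the 1x1 degree tile whose south-west corner is (lat, lon)."""
    ns = "N" if lat >= 0 else "S"
    ew = "E" if lon >= 0 else "W"
    return "%s%02d%s%03d" % (ns, abs(lat), ew, abs(lon))


def tiles_in_bbox(south, west, north, east):
    """List the tiles that overlap a bounding box given in decimal degrees."""
    names = []
    for lat in range(math.floor(south), math.ceil(north)):
        for lon in range(math.floor(west), math.ceil(east)):
            names.append(tile_name(lat, lon))
    return names


def md5_matches(data, expected_hex):
    """Compare the MD5 digest of downloaded bytes with a stored checksum."""
    return hashlib.md5(data).hexdigest() == expected_hex
```

A small box around Heidelberg, `tiles_in_bbox(49.3, 8.6, 49.5, 8.8)`, falls inside a single tile, so only that tile would need to be downloaded and checked.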

Since the App Engine still has some issues, I have revived the Postgres version of my application. It is located at http://altitude-pg.sprovoost.nl and contains most of Europe as far east as Moscow and as far south as Cyprus. It runs on my home computer in the Netherlands, so please be nice to it. I use apache-mod-python for the formal demonstration website and apache-mod-wsgi combined with web.py for the altitude profile server. To make this as painless as possible, I have moved all App Engine and Postgres-specific code to their own files and kept as much common functionality as possible in the main file. I can now run the development servers for both Apache and the App Engine from the same source code folder, at the same time.

I have requested more storage space on the App Engine and I am also considering a more efficient storage method. Instead of storing one altitude per record, I could store 100 altitudes per record and zip them. That would drastically reduce the total storage requirement, but at the cost of performance because I often need only about 2 out of these 100 altitudes.
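A rough sketch of what such a packed record could look like, assuming each altitude fits in a signed 16-bit integer and using zlib for the "zip" step. The function names and the little-endian layout are choices made for this illustration, not the storage format actually used.

```python
import struct
import zlib

BLOCK = 100  # altitudes per record, as suggested in the text


def pack_block(altitudes):
    """Pack altitudes (metres, signed 16-bit) into one compressed blob."""
    raw = struct.pack("<%dh" % len(altitudes), *altitudes)
    return zlib.compress(raw)


def unpack_block(blob):
    """Decompress a blob and return the list of altitudes."""
    raw = zlib.decompress(blob)
    return list(struct.unpack("<%dh" % (len(raw) // 2), raw))


def altitude_at(blob, index):
    """Read one altitude; the whole block has to be decompressed first."""
    return unpack_block(blob)[index]
```

The last function shows where the performance cost comes from: even when only 2 of the 100 values are needed, the entire record is fetched and decompressed.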

I have also been a bit more active on their mailing list; it feels good to be able to answer people’s questions and at the same time it allows me to verify my own code and design. There are also some interesting albeit more philosophical discussions on the list.

I have signed and fulfilled a pledge to “spend an hour OpenStreetMapping features on Caribbean islands from Yahoo! aerial imagery and […] donate £10 to the OpenStreetMap Foundation but only if 60 other people will do the same.”. I felt like I could really use another jet-lag. The pledge is full, but who knows, if they can rally another 60 people there might be a second ticket?

Those of you who laboriously follow every commit to the OpenStreetMap subversion repository may have noticed that I am still struggling with git-svn. I got really tired of fixing conflicts, so I unleashed the power of git-svn set-tree:

git-svn set-tree -i trunk 3cb585dca1d7fe10791312ca26125168506b61c1
git-svn set-tree -i trunk 07c9024f5ea4ce60f481b8089b61d4988e7588fa

Even the manual recommends against doing this, and you should make sure nobody else (like your mentor) has committed anything to subversion before you do this.

I find git-svn to be harder to use than it should be. I think it is trying too hard to properly translate between The Git Way and The Subversion Way. I just want the subversion repository to ‘sort of’ track my git repository. I don’t care if it has to represent the history a bit differently. Just keep the code up to date. I am looking forward to this command:

git-svn just-do-it

I really think Git would benefit the OpenStreetMap community, because it reflects the decentralized nature of OpenStreetMap. With Git, there is no such thing as a central repository. People can write any code they like without having to live in constant fear of breaking the trunk with their next commit. Instead, when they build something cool or useful, they will tell their friends to pull it in and experiment with it. The person who operates a production website will only pull pieces of code that he or she considers safe and useful enough.

But the reality is that many organizations rely on subversion at the moment and have excellent reasons for not risking their operations by making an instant jump to Git. So people are not going to adopt Git very quickly as long as it is so hard to sync with subversion. But let’s wait for a while and see…

I am getting better and better at keeping my git repository synchronized with the osm subversion, but I would not recommend this strategy to others.

I created a project on Google Code Hosting for the altitude profile. Not to host the code, not even for the wiki, but just to keep a list of issues. I realize I could have applied for a place on the OpenStreetMap Trac, but I want to use Google Code Hosting for my new project: Jobtorrent. This is also the reason most of the issues point to the Git source (I do point to subversion on the main page and the only reason I do not always point to both is that I am lazy). I will write more about Jobtorrent later; first I need to work on my Summer of Code project you know…

This list of issues should be good for continuity. Because my project does not interact with any OpenStreetMap code at the moment, I am probably the only one in the community who knows how the code works and what needs to be improved. That is a very low bus factor! (“tram factor” would be a better term in Melbourne) Now I really like the OpenStreetMap effort and I will certainly find ways to stay involved in the future, but it might be in a completely different project. Depending on circumstances, I should at least prepare for the possibility that the altitude profile project will be orphaned within a few months.

I use a personal organizing method inspired by the book Getting Things Done (David Allen) and that makes it very easy to transfer everything I am working on or thinking about to the Internet. So that is what I did.

The more difficult part is keeping it synchronized. David recommends that you never share your projects. That is, you should always keep your own lists and let nobody else touch them. Your lists must reflect what you want, or you will start to rebel against them and as a result mess up your system.

So in practice you will end up with a central list (e.g. the list of issues on Google Code) and your local copy of it. They will not be the same. There are a couple of things on my personal list that are not online (nothing groundbreaking, don’t worry) and my own priorities are not identical to the ones online. The online version reflects what is important for The Project, the offline version reflects what is important for me. At least in theory; as long as I am the only one working on it, it probably reflects my opinion a lot better than it ideally should.

Now I am pretty sure the average recruiter looking for a “true team player” does not like what I just said in the last paragraph.

Categories
gsoc

Is Google evil? And why the world is happy I’m not a CEO.

I have recently started following Umair Haque’s blog.

Umair Haque is Director of the Havas Media Lab, a new kind of strategic advisor that helps investors, entrepreneurs, and firms experiment with, craft, and drive radical management, business model, and strategic innovation.

He’s written a manifesto for the next industrial revolution and vowed to provide free consulting to five Web 2.0 startups that actually do something useful for a change (i.e. change the world).

He started an open discussion around the question whether Google is evil or not. I find this a fascinating question. Many people find great comfort in just answering the question with “yes” and pointing to some good examples. However I’ve met a couple of Googlers here and there, read some blogs and read the book. That convinced me that there are at least some people within the company who firmly believe in the Do No Evil motto and work very hard towards that goal.

Also, saying that Google as a whole is either evil or good implies a conspiracy. It implies that a select group of people with evil and greedy plans (I guess most people would point to the shareholders and managers because they wear suits) is in complete control of the company. Not only that, they deliberately employ naive people and let these naive people blog and talk about doing no evil, so that they can continue their evil plans unnoticed.

This conspiracy theory implies that Google is a well oiled machine, a super organized corporation. My outside observations convince me that that is far from the case. I would not go as far as to say they are an organizational mess, but they are definitely not organized enough to take over the world in a massive global conspiracy.

That argument also means they can’t be all Good. They are definitely doing things that I do not feel comfortable with, even things that I would probably still disagree with if I knew all the inside info. Just use their website to find examples.

So for those people who love simple answers: just pick the answer you like and stick to it, if it makes you feel better. There are always more important things in life than answering this question.

This is my reply to Umair’s question:

Can you think of instances where Google has violated this [do no evil] principle?

Probably. In some cases it is hard to decide because they won’t or can’t release all the internal discussion that went into certain decisions. We just have to take their word for it…

Is Google becoming more evil as it attains more market power?

… which becomes a very unattractive option given their current size. No matter how good their intentions, people (and especially institutions that are supposed to protect us from companies doing evil stuff) are just not going to take their word for it. You already mentioned this in a previous post. Update: no, that was an article in the International Herald Tribune titled “Google The New Master of Network Effects”, second-to-last paragraph.

Is the relationship between market power and evil set in stone – will Google inevitably become evil, because that’s what happens to companies (and people) as they grow up?

Nothing is set in stone. Google is just a decade old and probably has a hard time dealing with its own size and growth. They can grow really bad or really good. I also doubt that even Larry and Sergey have enough control over their creation to steer it in any direction, although I hope I’m wrong.

In my opinion the “best” (as opposed to “most evil”) thing Google could do is create complete openness. Ideally, they should open source *everything*, including their secret sauces and also have no more secret projects. I think the best way to keep governments happy, make people feel more comfortable and help the world move forward is to make it as easy as possible for a strong competitor to emerge. Ideally a new market player would just have to buy half a million computers, hire a couple of engineers and a marketing team, and be able to run a copy of Google, AdSense included.

Now this sounds a lot like economic suicide to me, but if they keep growing like this, then if I were Neelie Kroes I would enforce this at some point in the future anyway. So Google could, like Microsoft, wait for that moment or do it themselves right now.

I am sure Larry, Sergey and a whole lot of people working at Google would love this idea, but the shareholders probably won’t like it and it might not even be legal. Unless someone can find a way to do that and make a profit out of it.

Or we can enter the realm of my even more far fetched ideas: just start abusing your power and let the government help you bypass the shareholders. I guess most people are happy that I’m not a CEO 🙂

Categories
livejournal

Is-a-human.com

Prove you are a human!

Categories
gsoc website

RESTful

Update 17-7-2008 : “Formal” demo server now runs on apache2-mod-python

API

Not much has changed on the outside, but a lot has changed on the inside. Most importantly you can now get the altitude profile through a RESTful API. My mentor Artem Dudarev built a nice Google Maps demo that uses this API. A more formal demonstration can be found here (source code).

There is currently one way to send the route to my server, but there will be three ways. There are currently two types of output available for the profile, but there will be four. The wiki explains everything in detail, but here is a summary. It uses a short drive through Heidelberg as an example. Have a quick look in Google Maps.

Let’s start with the easiest input and output (which is not working yet by the way).

http://altitude.sprovoost.nl/profile/gchart?lats=49.407810,49.407770,49.408950,49.407040,49.406880,49.407620,49.413360,49.414800,49.414730&lons=8.681080,8.684210,8.692368,8.692670,8.693919,8.694270,8.692300,8.692110,8.693110

This simply sends a list of latitudes and longitudes via an HTTP/GET request and returns an image. You could use that in an <img> tag.
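As an illustration, a request URL of this shape could be assembled from a list of points roughly as follows. This is only a sketch: the helper name `profile_url` is made up, the six-decimal formatting is copied from the example above, and the endpoint itself is the one from the text (which is not live yet).

```python
from urllib.parse import urlencode

BASE = "http://altitude.sprovoost.nl/profile/gchart"


def profile_url(route, base=BASE):
    """Build the GET URL for a route given as a list of lat/lon dictionaries."""
    lats = ",".join("%.6f" % p["lat"] for p in route)
    lons = ",".join("%.6f" % p["lon"] for p in route)
    # keep commas readable instead of percent-encoding them
    return base + "?" + urlencode({"lats": lats, "lons": lons}, safe=",")


route = [
    {"lat": 49.407810, "lon": 8.681080},
    {"lat": 49.407770, "lon": 8.684210},
]
url = profile_url(route)
```

The resulting string can be used directly as the `src` of an image tag.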

A more standards compliant way is to send some XML, an OpenLS RouteGeometry object, through an HTTP POST request to: http://altitude.sprovoost.nl/profile/gchart_url/xml/ . Check out the wiki to see what that XML should look like. To be fair, OpenLS goes a little over my head, so don’t expect a very proper implementation at this point; consider it a gesture.

A more hip way, and probably also the most data-efficient and easiest to program, is to use the recently released Protocol Buffers from Google. Encoding in Python is as simple as this:

route_pb = altitudeprofile_pb2.Route()

for p in route:
    point = route_pb.point.add()
    point.lat = p['lat']
    point.lon = p['lon']

route_pb_string = route_pb.SerializeToString()

Here route is an array of dictionaries like {'lat': 49.407810, 'lon': 8.681080}. More details in the wiki.

The protocol buffer string is only 108 bytes in this case, whereas the XML document was 793 bytes and the GET string uses 180 bytes.

The examples here return a Google Chart image. The server will fetch this image and then send it to the client. This is easy to use, but uses more resources on my end than strictly necessary. Therefore it is also possible to fetch the URL to the image instead of the image itself. Instead of:

http://altitude.sprovoost.nl/profile/gchart?lats=...&lons=....

you use:

http://altitude.sprovoost.nl/profile/gchart_url?lats=....&lons=....

It is also possible to retrieve an XML document:

http://altitude.sprovoost.nl/profile/xml?lats=....&lons=....

And finally I will also support protocol buffers as an output format.

Any combination of input and output will be possible.

App Engine continued

The current version of altitude.sprovoost.nl runs on the Google App Engine. There are however a number of issues with the App Engine that prevent full-scale, planet-wide deployment at this stage.

The first problem is limited storage space (500 MB) during the test phase.

The second problem lies in the way the data is stored, which currently takes about 100 bytes per record. Since I store each altitude as one record, that adds up. It is currently unclear whether these 100 bytes include the key or not. Eventually Google will not count these keys towards data usage, because they prefer us to optimize for performance instead of cost. In theory I only require 2 bytes (a smallint) per record, so the whole planet would only require about 40 GB. There is a discussion on the mailing list about this.
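To put these numbers side by side, here is a back-of-the-envelope calculation using only the figures mentioned above. It assumes decimal units (1 GB = 10^9 bytes) and treats "about 40 GB" and "about 100 bytes" as exact, so the results are rough orders of magnitude rather than measurements.

```python
BYTES_PER_ALTITUDE = 2            # a smallint, the theoretical minimum
PLANET_BYTES_IDEAL = 40 * 10**9   # "about 40 GB" for the whole planet
BYTES_PER_RECORD = 100            # observed datastore cost per record
QUOTA_BYTES = 500 * 10**6         # 500 MB limit during the test phase

# number of altitude samples implied by 40 GB at 2 bytes each
records = PLANET_BYTES_IDEAL // BYTES_PER_ALTITUDE

# storage needed if every sample costs 100 bytes instead of 2
planet_bytes_actual = records * BYTES_PER_RECORD

# how many samples fit in the current quota at 100 bytes each
records_in_quota = QUOTA_BYTES // BYTES_PER_RECORD

overhead_factor = BYTES_PER_RECORD // BYTES_PER_ALTITUDE

print(records, planet_bytes_actual, records_in_quota, overhead_factor)
```

Under these assumptions the planet is roughly 20 billion records, which at 100 bytes each comes to about 2 TB, fifty times the theoretical 40 GB, while the 500 MB quota holds about 5 million records.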

The third problem is uploading data. I already explained in my previous post that uploading the planet with the current tools would take 16 years. I opened a ticket for it.

The fourth problem is that the App Engine can’t deal with Google’s own shiny new Protocol Buffers yet. I mentioned it on the mailing list and I’m sure that will be fixed soon.

In the meantime, I am working on getting my PostGIS version back up and running. It uses mostly the same code. I am also considering supporting MySQL, since I am not using any GIS functionality anyway.

The competition / colleagues

I am not the only one on this planet who is working on altitude profiles.

The most impressive one is probably Hey What’s That. If you ever enjoyed the view from a mountain or high-rise building and wondered what you were looking at in the distance, you’ll love this. They also provide an altitude profile with an HTTP/GET API. They even take the roundness of the earth into account and they are contemplating refraction.

That reminds me of the Wikipedia article about the horizon: I drew the diagram on that page years ago because I needed it for a paper on Islamic prayer times and refraction; as I understood it, you are not supposed to pray when the sun is exactly on the horizon, because pagans do that. Refraction makes that issue a bit more complicated and I spent some time with a fellow student trying to figure out how Islamic scientists looked at that issue in the past. Contact Jan Hogendijk if you find this fascinating. The figure is still there, albeit with some improvements.

The Google Maps Mania blog lists even more examples of altitude profiles.

This makes it ever more important for me to focus. My focus is:

  • open source (obviously since it’s a Google Summer of Code project)
  • short distances (so I can pretend the world is ‘flat’ and hopefully don’t have to worry about sudden spikes in elevation between sampled points)
  • different methods to access the service
  • KISS (Keep It Simple Stupid)
  • prioritize quality of data over quality of graphics
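The "short distances" point in the list above can be made concrete with a small sketch. It uses the equirectangular approximation, which treats a small patch of the earth as flat, and linear interpolation between two route points. Both are choices made for this illustration, not necessarily what the server does; the Earth radius constant and the function names are assumptions.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius; an assumption of this sketch


def flat_distance(p, q):
    """Approximate distance in metres between two nearby points."""
    mean_lat = math.radians((p["lat"] + q["lat"]) / 2.0)
    dx = math.radians(q["lon"] - p["lon"]) * math.cos(mean_lat)
    dy = math.radians(q["lat"] - p["lat"])
    return EARTH_RADIUS_M * math.hypot(dx, dy)


def sample_segment(p, q, step_m):
    """Return evenly spaced points from p to q, at most step_m apart."""
    n = max(1, math.ceil(flat_distance(p, q) / step_m))
    return [
        {"lat": p["lat"] + (q["lat"] - p["lat"]) * i / n,
         "lon": p["lon"] + (q["lon"] - p["lon"]) * i / n}
        for i in range(n + 1)
    ]
```

Sampling a segment more densely than the route's own points is one way to reduce the risk of missing a spike in elevation between them, at the cost of more altitude lookups.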

Git

Git-svn has some issues with branches. I found a solution here (also read the follow-ups). This is what I had to do to synchronize my master branch with subversion after I merged another branch into it:

Start with the svn head. I use a trunk/tags/branches structure in subversion. If you don’t, you may have to replace “trunk” with “git-svn”.

git checkout -b tempbranch trunk
# bring in all the changes from your branch
git merge --squash master
# commit with whatever message you want
git commit -m "Whatever, this message will self destruct anyway"
# and ship it to svn land
git svn dcommit
# Go back to your master branch
git checkout master
# Clean up
git branch -D tempbranch

I usually update svn less frequently than git. But if you want to use a more recent version of my code and can’t use git, just drop me an email.