Showing posts with label avro. Show all posts

Tuesday, December 20, 2011

Thrift IDL (protocol)

After the horrible experience with Avro, I considered Protocol Buffers and Thrift for the company. Protocol Buffers' strongest point is stability: not much has changed in the past few years. It is used in every single service at Google, it has gone through a very stringent code-review process, it was written by the world's most seasoned and anal engineers, and so it is well battle-tested. However, I consciously passed up the chance to suggest Protocol Buffers for the company, partly because I'm considered a biased party; suggesting it would simply reinforce the idea that "Kevin is a Googler so he's obviously biased. He thinks everything coming out of Google is amazing." To be fair, I really think that Google cranks out shit end-user products most of the time (Wave, Buzz, G+, Location, Google Base, Android, etc.). When Google does make a good end-user product, it's only because Google throws a billion darts in the dark and occasionally one of them hits the bullseye. That's all.

I tested Thrift, and it is acceptable. In terms of features, it is very similar to Protocol Buffers. The first thing I tested was backward and forward message compatibility, and there was no problem in either direction. Whereas Avro returns an error saying that the message format is different, the Thrift server gracefully (and correctly) disregards fields it doesn't recognize in newer messages and still accepts older messages that lack them.

In Java Thrift, you set your Thrift objects through getters and setters, which is great because if a field changes (name or type), the Java compiler gives you an error immediately. In Python Thrift, you can also set your Thrift objects through the constructor, and the runtime will catch name errors. In contrast, Avro does none of this, so your program just runs along happily even though you're setting my_integer="Not an integer", and somewhere down the line it crashes and leaves you scratching your head.
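
To make the contrast concrete, here is a minimal sketch in plain Python. It is not real Thrift-generated code: the Person class and its fields are hypothetical stand-ins. It only shows the general idea that a class with named constructor arguments rejects a misspelled field right at the call site, while a plain dict accepts any key and any value.

```python
class Person(object):
    """Hypothetical stand-in for a generated message class."""

    def __init__(self, firstname=None, lastname=None, age=None):
        self.firstname = firstname
        self.lastname = lastname
        self.age = age


def build_person(**fields):
    """Return (person, error); error holds the TypeError text, if any."""
    try:
        return Person(**fields), None
    except TypeError as exc:
        return None, str(exc)


# Correct field names: construction succeeds.
person, error = build_person(firstname="Kevin", age=30)

# Misspelled field name: Python raises TypeError right at construction.
bad_person, bad_error = build_person(firstnme="Kevin")

# A plain dict, by contrast, accepts any key and any value type.
loose_record = {"firstnme": "Kevin", "age": "Not an integer"}
```

Note that this sketch only catches wrong names, not wrong value types, which matches what I saw with the Python constructor.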

One last thing I love about Thrift: there is an asynchronous transport!!! This is exactly what powers AdSense, and allows people to easily prototype distributed computation architectures.
http://blog.rapleaf.com/dev/2010/06/23/fully-async-thrift-client-in-java/

There are a few Thrift "bugs" that should be fixed. For example, suppose your message definition contains the following:
2: string lastname = "last_default",
7: string lastname = "HO",
...

The above should signal a compiler error (e.g. "Same field name not allowed."). Many other mistakes should be flagged as errors too, but are not. I guess the developers are either too busy, too lazy, or just expect the target-language compiler (C or Java) to catch the error.
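
The check I have in mind is cheap. Here is a rough sketch in Python of what it could look like, assuming the field list has already been parsed into (id, type, name) tuples. The function name and the tuple layout are my own invention for illustration, not anything from the Thrift compiler.

```python
from collections import Counter


def find_duplicates(fields):
    """fields: list of (field_id, type_name, field_name) tuples.

    Returns (duplicate_ids, duplicate_names), each sorted.
    """
    id_counts = Counter(field_id for field_id, _, _ in fields)
    name_counts = Counter(name for _, _, name in fields)
    dup_ids = sorted(i for i, n in id_counts.items() if n > 1)
    dup_names = sorted(s for s, n in name_counts.items() if n > 1)
    return dup_ids, dup_names


# The two fields from the definition above, as hypothetical parsed tuples.
fields = [
    (2, "string", "lastname"),
    (7, "string", "lastname"),
]
dup_ids, dup_names = find_duplicates(fields)
```

Here the field IDs are distinct but the name "lastname" shows up twice, which is exactly the case that should stop the compile.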

One other minor difference between Protocol Buffers and Thrift: Thrift has no deprecation keyword. In Protocol Buffers, a field marked as deprecated carries over into the generated Java code, so the Java compiler warns you that the field is deprecated and programmers know to update. It's not a big deal, but it may matter for companies that keep updating contracts between two services.

In the end, my take on Avro vs. Thrift is this. Avro is like the Microsoft Zune. The Zune has all the bells and whistles: AM radio, recorder, more buttons, higher display resolution, external HD, blah blah blah. The iPod, on the other hand, just does one thing. On paper, the Zune is superior to the iPod. On paper, Avro is superior to Thrift. But in the end, Avro just doesn't work well (no forward/backward compatibility, buggy buggy buggy, and the developers don't even respond to my bug report). What looks good on paper isn't necessarily good in practice. You can't trust everything you read. You have to play with it.

Monday, December 19, 2011

Avro, what a complete waste of time

I'm responsible for evaluating the different IDLs (Protocol Buffers, Avro, and Thrift) as a unified form of communication between different services in the company. A key feature of today's IDLs is backward and forward message compatibility. For example, if the client adds one more field to a message, the server should be able to take the new message and process it (while ignoring the new field). The opposite should hold as well: if the server expects additional fields that the client does not send, the server should just assume that those fields are empty.
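
As a toy illustration of the behavior I'm asking for, here is a small Python sketch. It uses plain dicts rather than any real wire format, and the field names (to, from, body, priority) are made up for the example. The reader drops fields it doesn't know and fills in defaults for fields that are missing.

```python
# Hypothetical reader-side schema: field name -> default value.
SERVER_FIELDS = {"to": "", "from": "", "body": ""}


def read_message(wire_fields, known_fields):
    """Decode a message the tolerant way.

    Unknown fields are dropped; missing fields get the reader's default.
    """
    message = dict(known_fields)          # start from the defaults
    for name, value in wire_fields.items():
        if name in known_fields:          # skip fields the reader doesn't know
            message[name] = value
    return message


# Newer client sends an extra "priority" field: the server ignores it.
from_new_client = read_message(
    {"to": "AA", "from": "BB", "body": "MSG", "priority": 1}, SERVER_FIELDS)

# Older client omits "body": the server falls back to the empty default.
from_old_client = read_message({"to": "AA", "from": "BB"}, SERVER_FIELDS)
```

Neither call raises an error, and that is the whole point: a version mismatch in either direction should degrade gracefully.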

I started with Avro because I had high hopes for it. It had great features that neither PB nor Thrift had (no need for field deprecation, no need to get an IDL compiler), and it's built in to Hadoop's MapReduce. My experience with Avro began with downloading the package (version 1.6.1, the latest). I tried out some example code (phunt-avro-rpc-quickstart-avro-release-1.2.0-9-gce46e91.zip), which included two small Python scripts, start_server.py and send_message.py (the client). Both of them used the same IDL (mail.avpr). I got the client to send a message to the server with ease. Then I tried the most important aspect of an IDL: forward and backward message compatibility. I expected the server to gracefully accept old and new messages, but instead got something completely unexpected:

PATH=~/code/avro-example/avro-1.6.1/src ./send_message.py AA BB MSG
Traceback (most recent call last):
  File "./send_message.py", line 56, in <module>
    print("Result: " + requestor.request("myecho", {"mymessage": message}))
  File "/home/vm42/code/avro-example/avro-1.6.1/src/avro/ipc.py", line 145, in request
    return self.issue_request(call_request, message_name, request_datum)
  File "/home/vm42/code/avro-example/avro-1.6.1/src/avro/ipc.py", line 260, in issue_request
    call_response_exists = self.read_handshake_response(buffer_decoder)
  File "/home/vm42/code/avro-example/avro-1.6.1/src/avro/ipc.py", line 204, in read_handshake_response
    handshake_response.get('serverProtocol'))
  File "/home/vm42/code/avro-example/avro-1.6.1/src/avro/ipc.py", line 120, in set_remote_protocol
    REMOTE_PROTOCOLS[self.transceiver.remote_name] = self.remote_protocol
  File "/home/vm42/code/avro-example/avro-1.6.1/src/avro/ipc.py", line 475, in <lambda>
    remote_name = property(lambda self: self.sock.getsockname())
AttributeError: 'NoneType' object has no attribute 'getsockname'


A well-designed IDL should at least show a warning message indicating that the field is unknown (or new, etc). Nope! Avro fails with a weird socket-related error. Looking at the Avro library (avro-1.6.1/src/avro/ipc.py), line ~474 reads:

# read-only properties
sock = property(lambda self: self.conn.sock)
remote_name = property(lambda self: self.sock.getsockname())


So, I'm no Python expert, but it's clear that self.sock is None at this point, so I manually set remote_name in the constructor __init__ (meaning it's not a read-only property anymore, but who cares) and voilà, it works! Who the heck checked in this code anyway? My next attempt was the reverse: the server expects the newer message format and the client sends the older one. Here's my very useful Avro message:
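
For anyone who wants to see the failure and the workaround in isolation, here is a stripped-down sketch. The class names, the host and the port are all made up; this is not the Avro source, just the same two-property pattern next to the "set it in the constructor" hack I described. I'm assuming the connection object simply has no open socket yet when remote_name is read.

```python
class FakeConnection(object):
    """Hypothetical stand-in for a connection with no open socket."""
    sock = None


class FragileTransceiver(object):
    def __init__(self, conn):
        self.conn = conn

    # Same shape as the two read-only properties quoted above.
    sock = property(lambda self: self.conn.sock)
    remote_name = property(lambda self: self.sock.getsockname())


class PatchedTransceiver(object):
    def __init__(self, conn, host, port):
        self.conn = conn
        # Workaround: store the name up front instead of asking the socket.
        self.remote_name = (host, port)


try:
    FragileTransceiver(FakeConnection()).remote_name
    fragile_error = None
except AttributeError as exc:
    fragile_error = str(exc)

patched_name = PatchedTransceiver(FakeConnection(), "localhost", 9090).remote_name
```

The fragile version blows up with the same kind of AttributeError as in the traceback, while the patched one never touches the socket at all.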

/code/avro-example/avro-1.6.1/src ./send_message.py AA BB MSG
Traceback (most recent call last):
  File "./send_message.py", line 56, in <module>
    print("Result: " + requestor.request("myecho", {"mymessage": message}))
  File "/home/vm42/code/avro-example/avro-1.6.1/src/avro/ipc.py", line 145, in request
    return self.issue_request(call_request, message_name, request_datum)
  File "/home/vm42/code/avro-example/avro-1.6.1/src/avro/ipc.py", line 264, in issue_request
    return self.request(message_name, request_datum)
  File "/home/vm42/code/avro-example/avro-1.6.1/src/avro/ipc.py", line 145, in request
    return self.issue_request(call_request, message_name, request_datum)
  File "/home/vm42/code/avro-example/avro-1.6.1/src/avro/ipc.py", line 262, in issue_request
    return self.read_call_response(message_name, buffer_decoder)
  File "/home/vm42/code/avro-example/avro-1.6.1/src/avro/ipc.py", line 222, in read_call_response
    response_metadata = META_READER.read(decoder)
  File "/home/vm42/code/avro-example/avro-1.6.1/src/avro/io.py", line 445, in read
    return self.read_data(self.writers_schema, self.readers_schema, decoder)
  File "/home/vm42/code/avro-example/avro-1.6.1/src/avro/io.py", line 486, in read_data
    return self.read_map(writers_schema, readers_schema, decoder)
  File "/home/vm42/code/avro-example/avro-1.6.1/src/avro/io.py", line 615, in read_map
    block_count = decoder.read_long()
  File "/home/vm42/code/avro-example/avro-1.6.1/src/avro/io.py", line 184, in read_long
    b = ord(self.read(1))
TypeError: ord() expected a character, but string of length 0 found
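
The last frame says the decoder asked for one more byte and got an empty string back, i.e. it ran off the end of the response buffer while reading a long. Avro encodes longs as zigzag, variable-length integers, so a decoder could at least report that case clearly. Below is a minimal sketch of such a reader, written for Python 3. It is my own illustration, not the library's read_long, and the error message is invented.

```python
import io


def read_long(stream):
    """Decode one zigzag, variable-length long from a byte stream."""
    result = 0
    shift = 0
    while True:
        chunk = stream.read(1)
        if not chunk:
            # The case the traceback hit: nothing left to read.
            raise EOFError("unexpected end of buffer while reading a long")
        byte = chunk[0]
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:          # high bit clear: last byte of the value
            break
        shift += 7
    return (result >> 1) ^ -(result & 1)   # undo the zigzag encoding


value = read_long(io.BytesIO(b"\x80\x01"))   # two-byte encoding of 64

try:
    read_long(io.BytesIO(b""))
    eof_error = None
except EOFError as exc:
    eof_error = str(exc)
```

An "unexpected end of buffer" message would not have fixed the incompatibility, but it would have pointed at a truncated or mismatched response instead of a confusing ord() complaint.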


Alright, I've had enough. What I just attempted was a very, very common test case, and if there were any decent amount of unit tests, this problem would never have existed. My guess now is that there are no unit tests whatsoever, and that there isn't much of a user base, because I can't find this complaint anywhere via Googling (and I can't find much Avro documentation in the first place)! I sent an email to the Avro developer team this weekend and I've yet to receive a response. I am utterly unimpressed so far.

Conclusion:
1) Save yourself some time by using something else that is battle-tested. Bleeding edge (in this case) is a waste of time.
2) The only thing that matters is your hands-on experience. Marketing and bias make Avro look amazing (dynamic features, flexibility, maintenance-free, language support, ...), but none of that matters if it does not work TODAY.