Monday, December 19, 2011

Avro, what a complete waste of time

I'm responsible for evaluating the different IDLs (Protocol Buffers, Avro, and Thrift) as a unified form of communication between different services in the company. A key feature of today's IDLs is backward and forward message compatibility. For example, if the client adds one more field to a message, the server should be able to take the new message and process it (while ignoring the new field). The opposite should also hold: if the server expects an additional field that the client does not send, the server should just assume that the field is empty.
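To make the compatibility requirement concrete, here is a minimal sketch of the behavior I expect from any of these IDLs. It is plain Python, not the API of Avro, PB, or Thrift; the function `resolve_record`, the `_REQUIRED` sentinel, and the field names (`to`, `body`, `priority`) are all made up for illustration.

```python
_REQUIRED = object()  # hypothetical sentinel: the reader has no default for this field


def resolve_record(datum, reader_fields):
    """Project a decoded message onto the fields the reader knows about.

    reader_fields maps field name -> default value (or _REQUIRED).
    Unknown fields in the message are ignored; missing fields are
    filled from the reader's defaults.
    """
    resolved = {}
    for name, default in reader_fields.items():
        if name in datum:
            resolved[name] = datum[name]
        elif default is not _REQUIRED:
            resolved[name] = default
        else:
            raise ValueError("missing required field: %s" % name)
    return resolved


# Two versions of the same (hypothetical) message definition.
old_reader = {"to": _REQUIRED, "body": _REQUIRED}
new_reader = {"to": _REQUIRED, "body": _REQUIRED, "priority": 0}

new_message = {"to": "AA", "body": "MSG", "priority": 5}
old_message = {"to": "AA", "body": "MSG"}

# Old reader, new message: the extra field is dropped.
print(resolve_record(new_message, old_reader))
# New reader, old message: the missing field takes its default.
print(resolve_record(old_message, new_reader))
```

Both calls succeed without either side having to know which version the other one runs, which is the whole point of the test described in this post.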

I started with Avro because I had high hopes for it. It had great features that neither PB nor Thrift had (no need for field deprecation, no need for an IDL compiler), and it's built in to Hadoop's MapReduce. My experience with Avro began with downloading the package (version 1.6.1, the latest). I tried out some example code (phunt-avro-rpc-quickstart-avro-release-1.2.0-9-gce46e91.zip), which included two small Python scripts, start_server.py and send_message.py (the client). Both of them used the same IDL (mail.avpr). I got the client to send a message to the server with ease. Then I tried the most important aspect of an IDL: forward and backward message compatibility. I expected the server to gracefully accept old and new messages, but instead got something completely unexpected:

PATH=~/code/avro-example/avro-1.6.1/src ./send_message.py AA BB MSG
Traceback (most recent call last):
File "./send_message.py", line 56, in <module>
print("Result: " + requestor.request("myecho", {"mymessage": message}))
File "/home/vm42/code/avro-example/avro-1.6.1/src/avro/ipc.py", line 145, in request
return self.issue_request(call_request, message_name, request_datum)
File "/home/vm42/code/avro-example/avro-1.6.1/src/avro/ipc.py", line 260, in issue_request
call_response_exists = self.read_handshake_response(buffer_decoder)
File "/home/vm42/code/avro-example/avro-1.6.1/src/avro/ipc.py", line 204, in read_handshake_response
handshake_response.get('serverProtocol'))
File "/home/vm42/code/avro-example/avro-1.6.1/src/avro/ipc.py", line 120, in set_remote_protocol
REMOTE_PROTOCOLS[self.transceiver.remote_name] = self.remote_protocol
File "/home/vm42/code/avro-example/avro-1.6.1/src/avro/ipc.py", line 475, in <lambda>
remote_name = property(lambda self: self.sock.getsockname())
AttributeError: 'NoneType' object has no attribute 'getsockname'


A well-designed IDL should at least show a warning message indicating that the field is unknown (or new, etc.). Nope! Avro returns a weird socket-related error. Looking at the Avro library (avro-1.6.1/src/avro/ipc.py), line ~474 reads:

# read-only properties
sock = property(lambda self: self.conn.sock)
remote_name = property(lambda self: self.sock.getsockname())
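The failure in that last property can be reproduced in isolation. The sketch below uses stand-in classes (`FakeSocket`, `FakeConnection`, `BrokenTransceiver`, `CachingTransceiver` are all hypothetical names, not Avro's code), and it assumes the connection has dropped its socket by the time remote_name is read; the traceback only proves that the socket was None, not why. One possible workaround is to capture the name once in the constructor, while the socket still exists:

```python
class FakeSocket:
    """Stand-in for a connected socket (hypothetical address)."""

    def getsockname(self):
        return ("127.0.0.1", 9090)


class FakeConnection:
    """Stand-in for a connection object that can lose its socket."""

    def __init__(self):
        self.sock = FakeSocket()

    def close(self):
        self.sock = None  # assumption: the socket is dropped like this


class BrokenTransceiver:
    """Same shape as the two properties quoted above."""

    def __init__(self, conn):
        self.conn = conn

    sock = property(lambda self: self.conn.sock)
    remote_name = property(lambda self: self.sock.getsockname())


class CachingTransceiver:
    """Captures the name up front instead of asking the socket every time."""

    def __init__(self, conn):
        self.conn = conn
        self._remote_name = conn.sock.getsockname()

    sock = property(lambda self: self.conn.sock)
    remote_name = property(lambda self: self._remote_name)


conn = FakeConnection()
broken = BrokenTransceiver(conn)
cached = CachingTransceiver(conn)
conn.close()

try:
    broken.remote_name
except AttributeError as exc:
    print("broken:", exc)

print("cached:", cached.remote_name)
```

The cached value can of course go stale if the transceiver reconnects from a different local port, so this trades one problem for a smaller one.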


So, I'm no Python expert, but it's clear from the traceback that self.sock is None at this point, so I manually set remote_name in the constructor __init__ (meaning it's not a read-only property anymore, but who cares) and voilà, it works! Who the heck checked in this code anyway? My next attempt was the reverse: the server expects the newer message and the client sends the older one. Here's my very useful Avro message:

/code/avro-example/avro-1.6.1/src ./send_message.py AA BB MSG
Traceback (most recent call last):
File "./send_message.py", line 56, in <module>
print("Result: " + requestor.request("myecho", {"mymessage": message}))
File "/home/vm42/code/avro-example/avro-1.6.1/src/avro/ipc.py", line 145, in request
return self.issue_request(call_request, message_name, request_datum)
File "/home/vm42/code/avro-example/avro-1.6.1/src/avro/ipc.py", line 264, in issue_request
return self.request(message_name, request_datum)
File "/home/vm42/code/avro-example/avro-1.6.1/src/avro/ipc.py", line 145, in request
return self.issue_request(call_request, message_name, request_datum)
File "/home/vm42/code/avro-example/avro-1.6.1/src/avro/ipc.py", line 262, in issue_request
return self.read_call_response(message_name, buffer_decoder)
File "/home/vm42/code/avro-example/avro-1.6.1/src/avro/ipc.py", line 222, in read_call_response
response_metadata = META_READER.read(decoder)
File "/home/vm42/code/avro-example/avro-1.6.1/src/avro/io.py", line 445, in read
return self.read_data(self.writers_schema, self.readers_schema, decoder)
File "/home/vm42/code/avro-example/avro-1.6.1/src/avro/io.py", line 486, in read_data
return self.read_map(writers_schema, readers_schema, decoder)
File "/home/vm42/code/avro-example/avro-1.6.1/src/avro/io.py", line 615, in read_map
block_count = decoder.read_long()
File "/home/vm42/code/avro-example/avro-1.6.1/src/avro/io.py", line 184, in read_long
b = ord(self.read(1))
TypeError: ord() expected a character, but string of length 0 found
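The last frame says that read(1) came back empty, i.e. the decoder ran out of bytes while it was still expecting a long, and ord() choked on the empty string. Avro encodes longs as variable-length zigzag integers, so a decoder can at least report the truncated stream by name. A minimal sketch in Python 3 (`read_long` here is my own illustrative function, not the one in avro/io.py):

```python
import io


def read_long(stream):
    """Decode one zigzag, variable-length long from a binary stream."""
    result = 0
    shift = 0
    while True:
        chunk = stream.read(1)
        if not chunk:
            # The stream ended early: say so instead of failing inside ord().
            raise EOFError("stream ended while reading a long")
        byte = chunk[0]
        result |= (byte & 0x7F) << shift  # low 7 bits carry data
        if not (byte & 0x80):             # high bit clear: last byte
            break
        shift += 7
    # Undo the zigzag mapping (0, -1, 1, -2, ... stored as 0, 1, 2, 3, ...).
    return (result >> 1) ^ -(result & 1)


print(read_long(io.BytesIO(b"\xd8\x04")))

try:
    read_long(io.BytesIO(b""))
except EOFError as exc:
    print("error:", exc)
```

This does not fix whatever made the server send a short response, but an error like "stream ended while reading a long" would have pointed at the real problem much faster.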


Alright, I've had enough. What I just attempted was a very, very common test case, and if there were any decent amount of unit tests, this problem would never have existed. My guess now is that there are no unit tests whatsoever, and that there isn't much of a user base, because I can't find this complaint anywhere via Googling (and I can't find much Avro documentation in the first place)! I sent an email to the Avro developer team this weekend and I've yet to receive a response. I am thoroughly unimpressed so far.

Conclusion:
1) Save yourself some time by using something else that is battle tested. Bleeding edge (in this case) is a waste of time.
2) The only thing that matters is your hands-on experience. Marketing and bias make Avro look amazing (dynamic features, flexibility, maintenance free, language support, ...), but it doesn't matter if it does not work TODAY.

2 comments:

  1. I also found documentation for Avro to be scarce. I also prefer to have an IDL compiler generate the boiler-plate for me, rather than writing a bunch of adaptor code myself (required for any strongly typed language, like Java/C++).

  2. Do you have a link to your post on avro-dev? Did you raise a bug report? To be fair, this looks to be a bug in the *python RPC* implementation and not necessarily in avro's serialization features. The "amazing" features you mention have to do with serialization, and serialization != RPC. Based on http://www.cloudera.com/blog/2011/05/three-reasons-why-apache-avro-data-serialization-is-a-good-choice-for-openrtb/ and https://github.com/rfoldes/Avro-Test it would seem that schema evolution backwards/forwards compatibility is pretty good... That said, it's not reassuring that the python RPC code is lacking (not many unit tests either); my guess is the Java code gets more use.
