Obvious assumptions can result in bugs in software: Flask in embedded world
So the other day our team at Nokia was requested to implement web user interface for some embedded device. Usually I write in C++, but I was excited to take this task because we were able to put Python in place. Of course it's also possible to write web applications purely in C++ (e.g. with Cheerp - I encourage you to at least read about this project), but only thin back-end was required, so going with C++ didn't seem to be good choice. After cross-compiling Python to ARM (we anticipated that knowing how to do this would be common knowledge in the Internet, but again there were few resources) it was a time to pick some framework. We decided to use Flask - small cousin of Django, because
I had my own goal as well :) I didn't know Flask, so there was something new to learn.
Fast forward to a place when I was about to deploy my application on the target device (after long struggle to have everything cross-compiled). Everything was set up, so I eagerly clicked the button in the front-end (react.js-based stuff) to prove that the back-end is working as well. As you can imagine - it didn't work.
In the rest of this post I'd like to describe a reason of this failure, which I found somewhat funny.
192.168.255.*** - - [01/Jan/2004 02:00:22] "POST /************************* HTTP/1.1" 500 -
Traceback (most recent call last):
File "/****...****/Python2.7.5/lib/python2.7/site-packages/Flask-0.10.1-py2.7.egg/flask/app.py", line 1836, in __call__
return self.wsgi_app(environ, start_response)
File "/****...****/Python2.7.5/lib/python2.7/site-packages/Flask-0.10.1-py2.7.egg/flask/app.py", line 1820, in wsgi_app
response = self.make_response(self.handle_exception(e))
File "/****...****/Python2.7.5/lib/python2.7/site-packages/Flask-0.10.1-py2.7.egg/flask/app.py", line 1403, in handle_exception
reraise(exc_type, exc_value, tb)
File "/****...****/Python2.7.5/lib/python2.7/site-packages/Flask-0.10.1-py2.7.egg/flask/app.py", line 1817, in wsgi_app
response = self.full_dispatch_request()
File "/****...****/Python2.7.5/lib/python2.7/site-packages/Flask-0.10.1-py2.7.egg/flask/app.py", line 1479, in full_dispatch_request
response = self.process_response(response)
File "/****...****/Python2.7.5/lib/python2.7/site-packages/Flask-0.10.1-py2.7.egg/flask/app.py", line 1693, in process_response
self.save_session(ctx.session, response)
File "/****...****/Python2.7.5/lib/python2.7/site-packages/Flask-0.10.1-py2.7.egg/flask/app.py", line 837, in save_session
return self.session_interface.save_session(self, session, response)
File "/****...****/Python2.7.5/lib/python2.7/site-packages/Flask-0.10.1-py2.7.egg/flask/sessions.py", line 326, in save_session
val = self.get_signing_serializer(app).dumps(dict(session))
File "/****...****/Python2.7.5/lib/python2.7/site-packages/itsdangerous-0.24-py2.7.egg/itsdangerous.py", line 569, in dumps
rv = self.make_signer(salt).sign(payload)
File "/****...****/Python2.7.5/lib/python2.7/site-packages/itsdangerous-0.24-py2.7.egg/itsdangerous.py", line 412, in sign
timestamp = base64_encode(int_to_bytes(self.get_timestamp()))
File "/****...****/Python2.7.5/lib/python2.7/site-packages/itsdangerous-0.24-py2.7.egg/itsdangerous.py", line 220, in int_to_bytes
assert num >= 0
Ok, so normally I'm reading tracebacks in a bottom-up fashion. After looking at the last item in the traceback I instantly felt that something really weird is happening. Name of the file wasn't making things less weird as well. Items above the last one are not telling much more, so I had to dive in into source code to look for other hints.
Function int_to_bytes, at the very beginning, asserts that the input number is positive. Quite normal thing to do, right? So I started a debugger to check what value it got instead. I hunched that it would be -1 or something like that. Dark clouds were already in my mind at the time. I was even afraid that the hardware itself does not support some feature. That's not rare in embedded world, after all.
With the debugger I was able to retrieve value passed in. Apparently it was -220917729. Immediately I realized that I'm facing some issues with the time, and Python interpreter itself is not to be blamed. So I jumped into get_timestamp to see how the timestamp is created.
def get_timestamp(self):
"""Returns the current timestamp. This implementation returns the
seconds since 1/1/2011. The function must return an integer.
"""
return int(time.time() - EPOCH)
I double-checked that time.time() gives positive number. So after factoring out call to time() was obvious that the EPOCH is making the result negative. Then I needed to assemble two facts: EPOCH in this function is set to 1/1/2011 and the target device starts with time set to 1/1/2004. This yielded negative timestamp and a spectacular failure in result.
So I find this one funny, because apparently the authors didn't expect that the system time could be set to 2004-something :) Perhaps they assumed that it's not possible to run with a date earlier than the day when the library was published :)
Nevertheless I was expecting errors to appear in my configuration. You don't deploy web applications on embedded devices that often. Frankly speaking I encountered other funny problems during the day as well. One of them was with nginx, and if the time permits I'll write post about it as well. Stay tuned ;]
Note: I intentionally skipped description of internal Flask organization. In short words Flask uses ItsDangerous package to have signed session storage at the client-side. You can read more about it here.
- it's smaller than Django and we were already going to eat a lot of disk space with Python itself
- it seemed like a perfect candidate for the task, because it looked so simple
I had my own goal as well :) I didn't know Flask, so there was something new to learn.
Fast forward to a place when I was about to deploy my application on the target device (after long struggle to have everything cross-compiled). Everything was set up, so I eagerly clicked the button in the front-end (react.js-based stuff) to prove that the back-end is working as well. As you can imagine - it didn't work.
http://thenextweb.com/wp-content/blogs.dir/1/files/2015/09/fn63697_01.jpg |
Following is what I saw in logs. Some paths were masked, just-in-case. So after looking at this I thought to myself what the... It was working on my x86 machine!
In the rest of this post I'd like to describe a reason of this failure, which I found somewhat funny.
192.168.255.*** - - [01/Jan/2004 02:00:22] "POST /************************* HTTP/1.1" 500 -
Traceback (most recent call last):
File "/****...****/Python2.7.5/lib/python2.7/site-packages/Flask-0.10.1-py2.7.egg/flask/app.py", line 1836, in __call__
return self.wsgi_app(environ, start_response)
File "/****...****/Python2.7.5/lib/python2.7/site-packages/Flask-0.10.1-py2.7.egg/flask/app.py", line 1820, in wsgi_app
response = self.make_response(self.handle_exception(e))
File "/****...****/Python2.7.5/lib/python2.7/site-packages/Flask-0.10.1-py2.7.egg/flask/app.py", line 1403, in handle_exception
reraise(exc_type, exc_value, tb)
File "/****...****/Python2.7.5/lib/python2.7/site-packages/Flask-0.10.1-py2.7.egg/flask/app.py", line 1817, in wsgi_app
response = self.full_dispatch_request()
File "/****...****/Python2.7.5/lib/python2.7/site-packages/Flask-0.10.1-py2.7.egg/flask/app.py", line 1479, in full_dispatch_request
response = self.process_response(response)
File "/****...****/Python2.7.5/lib/python2.7/site-packages/Flask-0.10.1-py2.7.egg/flask/app.py", line 1693, in process_response
self.save_session(ctx.session, response)
File "/****...****/Python2.7.5/lib/python2.7/site-packages/Flask-0.10.1-py2.7.egg/flask/app.py", line 837, in save_session
return self.session_interface.save_session(self, session, response)
File "/****...****/Python2.7.5/lib/python2.7/site-packages/Flask-0.10.1-py2.7.egg/flask/sessions.py", line 326, in save_session
val = self.get_signing_serializer(app).dumps(dict(session))
File "/****...****/Python2.7.5/lib/python2.7/site-packages/itsdangerous-0.24-py2.7.egg/itsdangerous.py", line 569, in dumps
rv = self.make_signer(salt).sign(payload)
File "/****...****/Python2.7.5/lib/python2.7/site-packages/itsdangerous-0.24-py2.7.egg/itsdangerous.py", line 412, in sign
timestamp = base64_encode(int_to_bytes(self.get_timestamp()))
File "/****...****/Python2.7.5/lib/python2.7/site-packages/itsdangerous-0.24-py2.7.egg/itsdangerous.py", line 220, in int_to_bytes
assert num >= 0
Ok, so normally I'm reading tracebacks in a bottom-up fashion. After looking at the last item in the traceback I instantly felt that something really weird is happening. Name of the file wasn't making things less weird as well. Items above the last one are not telling much more, so I had to dive in into source code to look for other hints.
Function int_to_bytes, at the very beginning, asserts that the input number is positive. Quite normal thing to do, right? So I started a debugger to check what value it got instead. I hunched that it would be -1 or something like that. Dark clouds were already in my mind at the time. I was even afraid that the hardware itself does not support some feature. That's not rare in embedded world, after all.
With the debugger I was able to retrieve value passed in. Apparently it was -220917729. Immediately I realized that I'm facing some issues with the time, and Python interpreter itself is not to be blamed. So I jumped into get_timestamp to see how the timestamp is created.
def get_timestamp(self):
"""Returns the current timestamp. This implementation returns the
seconds since 1/1/2011. The function must return an integer.
"""
return int(time.time() - EPOCH)
I double-checked that time.time() gives positive number. So after factoring out call to time() was obvious that the EPOCH is making the result negative. Then I needed to assemble two facts: EPOCH in this function is set to 1/1/2011 and the target device starts with time set to 1/1/2004. This yielded negative timestamp and a spectacular failure in result.
So I find this one funny, because apparently the authors didn't expect that the system time could be set to 2004-something :) Perhaps they assumed that it's not possible to run with a date earlier than the day when the library was published :)
Nevertheless I was expecting errors to appear in my configuration. You don't deploy web applications on embedded devices that often. Frankly speaking I encountered other funny problems during the day as well. One of them was with nginx, and if the time permits I'll write post about it as well. Stay tuned ;]
Note: I intentionally skipped description of internal Flask organization. In short words Flask uses ItsDangerous package to have signed session storage at the client-side. You can read more about it here.