Replicating the human brain: Deep learning in action
PyData Mallorca, 2017
Machine Learning for Digital Advertising
Outreach Digital, London, 2016
Machine Learning for Digital Advertising
PyData London, 2016
Understanding Random Forests
PyData Madrid, 2016
CART: Not only classification and regression trees
PyData Amsterdam, 2016
High Performance Python
Barcelona Python Meetup, 2015
Remote execution of Python scripts using Viri
EuroPython, Firenze, 2011
Google Summer of Code Overview Panel
DjangoCon, Portland, OR, 2009
Bank of America Merrill Lynch
Bank of America Merrill Lynch is one of the largest financial institutions in the world. As part of the Middle Office Technologies team, my responsibilities are to develop software to reconcile information among different systems.
Logitravel is an online travel agency specialising in holidays, offering a large range of travel products and services at great prices: package holidays, hotels, flights, car hire and more.
Logitravel was founded in 2004 by a team of talented web programmers with experience in the travel industry. From the very beginning the company has aimed to satisfy the needs of a growing niche within the market by providing an aggressive price policy and a wide range of products, all with great customer service.
Logitravel is among the top three online travel agencies in Spain and one of the fastest growing in Europe. Operating in 7 countries, with a turnover of over €500M, Logitravel has a team of over 300 young, skilled and multicultural employees.
The project consisted of applying data mining and machine learning techniques to marketing analytics for hotel room sales.
It involved analyzing patterns in user behavior when searching for hotels, estimating the probability that a specific user would book a hotel room using machine learning models, and optimizing revenue and profit in online marketing campaigns.
Unilever is a British-Dutch multinational consumer goods company co-headquartered in Rotterdam, Netherlands, and London, United Kingdom. Its products include food, beverages, cleaning agents and personal care products. It is the world's third-largest consumer goods company measured by 2012 revenue, after Procter & Gamble and Nestlé. Unilever is the world's largest producer of food spreads, such as margarine. One of the oldest multinational companies, its products are available in around 190 countries.
The project consisted of developing a platform to support the sales team in marketing tasks. The application was web-based but designed for offline use, relying on the Google Chrome local storage API. It contained a product catalog, a dessert menu designer, and a back end to manage the workflow for printing the menus (acceptance/denial, generation of Adobe InDesign drafts, budget control, etc.).
The Nippon Telegraph and Telephone Corporation, commonly known as NTT, is a Japanese telecommunications company headquartered in Tokyo, Japan. Ranked 65th in Fortune Global 500, NTT is the third largest telecommunications company in the world in terms of revenue.
The project involved the development of internal tools to automate common sysadmin tasks.
The main tools developed were:
A distributed system written in Python, Viri, to automate tasks such as backups, monitoring, updates, maintenance, information gathering, etc. across a large number of computers
A Django interface to manage hardware and software inventory
An automated reporting system for monitoring and database reports
La Caixa, formally Caixa d'Estalvis i Pensions de Barcelona (Spanish: Caja de Ahorros y Pensiones de Barcelona), is Europe’s leading savings bank and Spain's third largest financial institution, with a network of over 5,800 branches, more than 9,500 automated teller machines, a workforce in excess of 31,900 and more than 13 million customers.
The La Caixa foundation owns several leading museums in Spain's main cities, including Madrid and Barcelona. Among them are CosmoCaixa (science museums) and CaixaForum (exhibition museums). This project consisted of developing a website to promote the educational activities offered in these museums.
Implementation of additional i18n features on Django
The problem
While Django provides an amazing system to translate texts, and displays localized dates in some parts of the admin, there is still a lot of data that could be internationalized but is not yet.
The information that developers should be able to localize/translate is mainly:
All dates and related information (times, calendars...)
All numbers (mainly decimal ones)
Texts (and any data in general) saved on the database
The proposed solution for improving Django i18n includes several different tasks. Those tasks are:
Import locale data from CLDR
Apply i18n to Django dates and times
Apply i18n to Django numbers
Allow translating content on the database
Fix already reported i18n bugs
The details for every task follow. Note that all these specifications are subject to change, according to discussions with the project mentor, the Django core developers team, and the wider Django community.
Importing locale data
The main repository of locale data is the Common Locale Data Repository (CLDR) by the Unicode Consortium (http://cldr.unicode.org/). It provides a set of XML files with information such as date, time and number formatting for most languages.
The idea of this task is to create a Python script (probably as a django-admin command) that extracts all the necessary data from those XML files and puts it into configuration files within the Django structure. Django will use this information to internationalize data in applications.
This script is meant to be used only by Django developers. It would mainly be a one-time script, run again only when new locales are added (or some are changed).
All information gathered from CLDR files could be saved in django/conf/locale/{language_code}/formats/django.po
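As a rough illustration of the extraction step, the sketch below reads the number separators out of an LDML-shaped document using only the standard library. The XML fragment is hand-written for the example rather than taken from a real CLDR file, and the function name and the setting names in the result (DECIMAL_SEPARATOR, THOUSAND_SEPARATOR) are hypothetical, chosen only for illustration. Writing the values into the .po file is left out.

```python
import xml.etree.ElementTree as ET

# Hand-written fragment in the shape of a CLDR LDML file (not real CLDR data).
SAMPLE_LDML = """\
<ldml>
  <identity>
    <language type="es"/>
  </identity>
  <numbers>
    <symbols numberSystem="latn">
      <decimal>,</decimal>
      <group>.</group>
    </symbols>
  </numbers>
</ldml>
"""


def extract_number_symbols(ldml_text):
    """Return the language code and number separators found in an LDML document."""
    root = ET.fromstring(ldml_text)
    language = root.find("identity/language").get("type")
    symbols = root.find("numbers/symbols")
    return {
        "language": language,
        # Hypothetical setting names, used only for this example.
        "DECIMAL_SEPARATOR": symbols.findtext("decimal"),
        "THOUSAND_SEPARATOR": symbols.findtext("group"),
    }


settings = extract_number_symbols(SAMPLE_LDML)
print(settings)
```

A real command would loop over every locale file and would also need to handle locales that inherit values from a parent locale, which this sketch ignores.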
Specific settings imported from CLDR could be (with English localized example):
Some locale-based parameters already exist in Django, in the translation files (LC_MESSAGES), and could be deprecated in future releases of Django (when breaking backward compatibility). Those are:
DATETIME_FORMAT
DATE_FORMAT
TIME_FORMAT
YEAR_MONTH_FORMAT
MONTH_DAY_FORMAT
To keep the system flexible, the existing default values in settings will be kept. It would probably be worth adding new ones for the new customizable formats.
Dates, times and calendar i18n
All dates and times displayed using Django should use the format defined for the current session locale. This is already implemented for some dates, like the ones displayed in the admin's lists. A filter for formatting dates also exists in templates which, together with the formats in the translation files, can do the job. But the better approach would be to display dates in the session locale by default.
All Django forms (including admin forms) should accept the short date/datetime format of the current locale. It is currently possible to define the accepted formats using parameters of the widget, and this can be kept, but at least support for entering data formatted in the current locale should be added. ISO and/or English locale formats can be kept as well. Existing data in input fields should be displayed in the current locale too.
As the Django 1.0 series maintains backward compatibility, these changes have to be implemented so that they are compatible with existing behavior by default.
The calendar on the admin's date/datetime field should also be displayed according to the user's session locale.
So basically, these are the main tasks required for internationalizing Django dates:
Format all Python date/datetime objects using locale settings when they are converted to strings for display. Basically, this means models.DateField and models.DateTimeField values on model instances.
Change input widgets to display data, and to allow entering data, in the format of the current locale.
Display the admin calendar with weeks starting on the day defined for the current locale.
With those changes, the following tickets would be fixed:
#1061 About first day on calendars
#5526 About accepting non-English formats on input widgets
#6231 About the output format of the SelectDateWidget
#6449 About default format of displayed dates
#6483 About supporting European dates on javascript routines
#7509 About supporting different formats on SplitDateTimeWidget
#7656 About inheriting i18n features of AdminDateWidget
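The three date tasks in this section can be sketched in plain Python as follows. This is only an illustration of the intended behavior, not Django code. The LOCALE_FORMATS table and the function names are hypothetical, and in the proposal the per-locale values would come from the imported CLDR data.

```python
import calendar
import datetime

# Hypothetical per-locale table; real values would come from the imported CLDR data.
LOCALE_FORMATS = {
    "en": {"short_date": "%m/%d/%Y", "first_weekday": calendar.SUNDAY},
    "es": {"short_date": "%d/%m/%Y", "first_weekday": calendar.MONDAY},
}


def format_date(value, locale):
    """Render a date using the short date format of the given locale."""
    return value.strftime(LOCALE_FORMATS[locale]["short_date"])


def parse_date(text, locale):
    """Parse user input written in the short date format of the given locale."""
    fmt = LOCALE_FORMATS[locale]["short_date"]
    return datetime.datetime.strptime(text, fmt).date()


def week_header(locale):
    """Weekday numbers (0 = Monday) in the order a calendar widget would show them."""
    cal = calendar.Calendar(firstweekday=LOCALE_FORMATS[locale]["first_weekday"])
    return list(cal.iterweekdays())


day = datetime.date(2009, 3, 31)
print(format_date(day, "en"))           # 03/31/2009
print(format_date(day, "es"))           # 31/03/2009
print(parse_date("31/03/2009", "es"))   # 2009-03-31
print(week_header("en"))                # [6, 0, 1, 2, 3, 4, 5]
```

Keeping formatting and parsing driven by the same table means a value displayed in a widget can always be read back in the same locale.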
Number i18n
Right now, Django doesn't provide anything for localizing numbers in applications. All numeric values within Django applications are formatted using American formats. Users from many countries are not used to the American format, and even a simple shop built with Django can confuse users who, for example, expect the comma as the decimal separator and find a point in prices instead.
As in the previous section, changes must keep backward compatibility.
So Django should, by default, use the conventions of the current locale to display and format numbers. Basically, that means:
Format numbers in templates using the current session locale
Display and allow entering data using the session locale in input widgets
With those changes, the following ticket should be fixed:
#3940 About comma as decimal separator
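A minimal sketch of the two number tasks above, again in plain Python rather than Django code. The NUMBER_SYMBOLS table and the function names are hypothetical, and the separators would be taken from the imported locale data in practice.

```python
from decimal import Decimal

# Hypothetical per-locale separators; real values would come from the locale data.
NUMBER_SYMBOLS = {
    "en": {"decimal": ".", "group": ","},
    "es": {"decimal": ",", "group": "."},
}


def format_number(value, locale, decimals=2):
    """Format a number with the separators of the given locale."""
    symbols = NUMBER_SYMBOLS[locale]
    # Format with the American separators first, then swap both in a single pass
    # so that one replacement cannot clobber the other.
    american = f"{value:,.{decimals}f}"
    table = str.maketrans({",": symbols["group"], ".": symbols["decimal"]})
    return american.translate(table)


def parse_number(text, locale):
    """Parse user input written with the separators of the given locale."""
    symbols = NUMBER_SYMBOLS[locale]
    cleaned = text.replace(symbols["group"], "").replace(symbols["decimal"], ".")
    return Decimal(cleaned)


price = Decimal("1234.5")
print(format_number(price, "en"))      # 1,234.50
print(format_number(price, "es"))      # 1.234,50
print(parse_number("1.234,50", "es"))  # 1234.50
```

Using Decimal rather than float on the parsing side avoids rounding surprises with prices.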
Fix i18n bugs
Many bugs already accepted in the Django Trac would be fixed during this Summer of Code. A more thorough review will be done, but some of them could be:
#3907: LocaleMiddleware allows languages not supported by Django