Data Science: 3 lessons learned from my first year working with data in DC

By David Elges, Chief Information Officer, DC Government


In this article, I share three lessons I learned at the DC Government, which uses data to tackle some of the most complex problems in the District of Columbia, problems that affect the District's most vulnerable citizens: its children. I recently completed a year working with data at the District of Columbia Government's Child and Family Services Agency. I do not build complex algorithms or statistical models, although I deal with these topics frequently. As Chief Information Officer, my primary role is to set development and deployment strategies across the agency to drive impact and, ultimately, a return on investment (ROI).

Lesson 1: General data integration usually causes more impact than the localized use of super-algorithms and machine learning.

The most common data problem in an organization or agency is not understanding the data but integrating multiple sources and platforms. Even in a small startup, information is spread across countless systems, environments, and spreadsheets, such as Google Analytics, Salesforce, Oracle, Google Spreadsheets, Microsoft Excel, and many others. For organizations large and small, integrating all of this data into a single, centralized environment, with proper access rights and permission controls, is more valuable than, for example, applying a machine learning algorithm to optimize historical data alone. The flow of information within a large organization is full of inefficiency, and that inefficiency translates into wasted time, resources, and money. Improving the general flow of information across the organization creates more value than optimizing a single small sector.
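As a rough illustration of what "integrating into a single environment" can mean at the smallest scale, the sketch below loads two exports into one in-memory SQLite database and joins them on a shared key. Everything in it is an assumption made for illustration: the two sources, the column names (`record_id`, `status`, `owner`), and the sample rows are hypothetical and do not describe any agency system.

```python
import csv
import io
import sqlite3

# Hypothetical exports from two separate systems (for example, a CRM and a spreadsheet).
CRM_CSV = "record_id,status\n101,open\n102,closed\n"
SHEET_CSV = "record_id,owner\n101,Rivera\n103,Chen\n"


def load_rows(text):
    """Parse CSV text into a list of dicts, one per data row."""
    return list(csv.DictReader(io.StringIO(text)))


def integrate(crm_rows, sheet_rows):
    """Load both sources into one SQLite database and join them on record_id."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE crm (record_id INTEGER PRIMARY KEY, status TEXT)")
    conn.execute("CREATE TABLE sheet (record_id INTEGER PRIMARY KEY, owner TEXT)")
    conn.executemany("INSERT INTO crm VALUES (:record_id, :status)", crm_rows)
    conn.executemany("INSERT INTO sheet VALUES (:record_id, :owner)", sheet_rows)
    # LEFT JOIN keeps every CRM record, even when the spreadsheet has no match.
    rows = conn.execute(
        "SELECT crm.record_id, crm.status, sheet.owner "
        "FROM crm LEFT JOIN sheet ON crm.record_id = sheet.record_id "
        "ORDER BY crm.record_id"
    ).fetchall()
    conn.close()
    return rows


combined = integrate(load_rows(CRM_CSV), load_rows(SHEET_CSV))
for row in combined:
    print(row)
```

Even in this toy version, the join surfaces the kind of gap that stays hidden while the data lives in separate files: one record has no owner in the spreadsheet, and one spreadsheet row has no matching record in the CRM export.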

Lesson 2: Data Science is about people.

In a large organization, you deal primarily with people, not machines and systems. It is people who make decisions based on what they see and how they interpret the data. Even when you apply a new, super-sophisticated algorithm to a table with one billion records, the output must be simple enough to be understood by everyone and valuable enough to change the habits of the person who will consume that information. In this sense, my experience with user interface and user experience design has proven highly valuable: presenting well-constructed graphs, easy-to-understand control panels, and efficient applications directly affects how a person understands, evaluates, and keeps using a data science project.

Lesson 3: Changing habits requires a huge energy investment.

I learned much more about what makes data science projects succeed by building software for the masses and reading about psychology and the cognitive sciences than by studying data science itself. Implementing a new data ecosystem in an organization, for example by integrating 28 data sources and 1 trillion records containing 10 years of historical information, is a big lift for anyone. But I would say it takes just as much effort to get a small group, even just a dozen people with different professional backgrounds, to give up their old habits, spreadsheets, and emails and move to a new workflow in a tool they have never seen. There is always the "but it was working" and the "that's the way we have always done it." The challenge is to show that the new way will bring more productivity and precision to the team, freeing up staff time for other activities that an automated system still cannot do.

The current market is less sci-fi and more business

All of the human challenges involved in a used car sale are also involved in implementing a data science project. People have habits, preferences, power plays, sympathies and antipathies, and everything else that has defined human relationships since we decided to climb down from the trees and organize ourselves into groups. Understanding this fundamental aspect of business development can help you be much more successful in your next projects, whether they involve data science or lemonade sales.