Client blends U.S. IoT tech agency with Eastern European data science firm

Our client designs lactation pods for breastfeeding moms on the go. I never used to notice the pods, but now that we’ve worked together, I spot them everywhere, especially on my many trips through airports!

While maintaining a successful relationship with a U.S.-based software and IoT agency that designs and builds their core technology, our client (with the agency’s support) was seeking an alternative provider for their business intelligence and data science needs.

Their goals were:

  • develop a trusted relationship with a highly skilled, relatively low-cost business intelligence and data engineering firm
  • contract business analysis, consultation, data engineering implementation, and SLA support
  • implement a BI solution, create internal dashboards and client-facing reports, and automate report delivery
  • automate the early detection of irregularities in the operation of the hardware
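
To make the last goal concrete, here is a minimal sketch of one common way to flag irregular hardware readings: compare each new reading against a rolling window of recent ones. It is an illustration only, not the solution the provider built. The function name, the window size, the threshold, and the sample temperature readings are all assumptions made for this example.

```python
from statistics import mean, stdev


def flag_irregular_readings(readings, window=5, threshold=3.0):
    """Return the indices of readings that deviate strongly from the
    preceding `window` readings (a simple rolling z-score check).

    All names and default values here are hypothetical.
    """
    flagged = []
    for i in range(window, len(readings)):
        history = readings[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Perfectly flat history: treat any change as irregular.
            if readings[i] != mu:
                flagged.append(i)
            continue
        # Flag the reading if it sits more than `threshold` standard
        # deviations away from the recent average.
        if abs(readings[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged


# Hypothetical pod temperature readings with one obvious spike.
sample = [21.0, 21.2, 20.9, 21.1, 21.0, 21.1, 35.0, 21.0]
print(flag_irregular_readings(sample))
```

In practice a check like this would likely run on a schedule against data in the warehouse and feed an alert or dashboard, but those details depend on the platform and are not described in this case study.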

A fundamental question our client asked was: “Can business analysis and BI / data consultation be accomplished with a remote data engineering firm across distance and time zones? Could they understand our business well enough to steer our data platform effectively and operate efficiently?” Key questions indeed!

We suggested an experimental phase: focusing on select geographies where we’ve succeeded in the past, we would conduct first-round interviews with BI / data science firms. Such firms are still emerging in regions such as Eastern Europe and Central and South America. This first phase homed in on the key requirement: state-of-the-art business analysis and data consultation. Could it be done remotely across distance and time zones? Three firms surfaced as potential candidates, and we gained enough confidence to recommend that our client move forward to a full procurement process.

The results? Although it took an extensive search, both we and our client were impressed by the level of data science talent we found, particularly in one location. And when we kicked off the project, despite the different time zones, communication between our client’s team and the chosen provider was exemplary.

The relationship began with a discovery workshop, followed by an agile process of short daily standup meetings. These let our client iteratively deepen the analysts’ and engineers’ domain knowledge, quickly verify direction, provide feedback, share concerns, and plan the next stages.

Solutions were developed iteratively using Google’s data warehouse platform and tools, as well as open-source technologies. One year later, the relationship continues to go very well, and our client remains highly satisfied.

Here are a few of the criteria TeamFound injected into the procurement process:

  • we focused on two countries in Eastern Europe where the data science profession had greater maturity
  • we sought agencies with a data science focus, rather than large outsourcing providers with data science teams
  • the provider’s English-language and communication skills had to be as good as those of U.S.-based options
  • the provider had to be working on significant, large-scale data science projects for prominent European clients
  • we reviewed the individual professional experience of the provider’s team members and looked for professional certifications in cloud and data engineering

Some of the data engineering capabilities we found were:

  • Cloud solutions, especially Google Cloud Platform and AWS
  • Hadoop distributions and fully-managed cloud services
  • Spark, Hive, Tez, Flink, Beam, Google Cloud Dataflow, Jupyter Notebook, and Hue for data processing and analytics
  • BigQuery, Presto, Impala, Hive LLAP, Druid, Kylin, and Elasticsearch for low-latency analytics, search, and queries on huge data sets
  • Kafka, Kafka Connect, Debezium, Pub/Sub, Flink, Dataflow, Beam, and Spark Streaming for real-time streaming analytics
  • NoSQL and key-value databases (HBase and Cassandra)
  • Spark MLlib, SparkR, NumPy, and scikit-learn for advanced analysis and machine learning
  • Kubernetes, Docker, OpenShift, and Helm for containerization and orchestration
  • Databases like BigQuery, PostgreSQL, Hive, Cassandra, Elasticsearch
  • BI technologies such as Tableau on-premises, Looker in the cloud, and open-source solutions