David D. Smit

Overview

Accomplished engineer with 9+ years of experience in data engineering, software engineering, product development, and manufacturing engineering. I have produced data-driven results through the application of data analytics and engineering tools in conjunction with Six Sigma methodologies. I have extensive experience with process automation, new product development, vendor relationship management, root cause analysis, and corrective actions. I am highly effective in a fast-paced, continuous improvement environment.

Career History

Schaeffler Group - Solutions Data Engineer

2021-07-12 to Present

I am responsible for developing data-driven solutions that improve business outcomes in manufacturing, quality, logistics, and purchasing by understanding stakeholder needs and translating them into technical solutions.

Accomplishments

    Ball Inspection Improvements

    The quality group requested an AI-based inspection process for rolling elements, resulting in a makeshift system with inadequate throughput. I automated deployments and optimized the process by migrating to FastAPI with AsyncIO and implementing TensorFlow Serving, which led to a 57% improvement in throughput.

    Faster, Automated Deployments

    I took the existing code base, a Flask application that called TensorFlow directly, and created a CI/CD pipeline in Jenkins to deploy it to a machine that met all of the isolation requirements. Having the application deployed automatically allowed for fast experiments in the productive environment to optimize performance.

    Better Distribution of Processing

    I made structural and code changes to improve performance. First, I migrated to FastAPI so I could leverage AsyncIO (many of the operations involved reading and writing files). I also removed the inference step from the FastAPI application and created a TensorFlow Serving process that the application communicated with over gRPC. After running some experiments, I determined the optimal combination of inference batching parameters and preprocessing workers. I also updated the process to render the bounding boxes on the image only on request, which minimized the inline cycle time (operators generally only clicked a couple of images each cycle).

    The combination of batch inference, multiple application workers, and AsyncIO (so workers could be utilized more fully) yielded a 57% improvement in throughput.
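
    A minimal sketch of the request path, with hypothetical model name, address, tensor keys, and preprocessing (the real service had more configuration and batching tuning):

    import grpc
    import numpy as np
    import tensorflow as tf
    from fastapi import FastAPI, UploadFile
    from tensorflow_serving.apis import predict_pb2, prediction_service_pb2_grpc

    app = FastAPI()

    # Hypothetical values; the real address, model name, and tensor keys came from config.
    TF_SERVING_ADDRESS = "localhost:8500"
    MODEL_NAME = "ball_inspection"

    def preprocess(raw: bytes) -> np.ndarray:
        # Hypothetical preprocessing; the real pipeline did more (cropping, normalization, etc.).
        img = tf.image.resize(tf.io.decode_image(raw, channels=3), (224, 224)) / 255.0
        return np.expand_dims(img.numpy().astype(np.float32), axis=0)

    @app.post("/inspect")
    async def inspect(image: UploadFile):
        # Reading the upload is I/O-bound, so AsyncIO lets the worker serve other requests meanwhile.
        batch = preprocess(await image.read())

        # TensorFlow Serving performs the inference (and server-side batching); the FastAPI
        # process only builds the gRPC request and interprets the response.
        async with grpc.aio.insecure_channel(TF_SERVING_ADDRESS) as channel:
            stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)
            request = predict_pb2.PredictRequest()
            request.model_spec.name = MODEL_NAME
            request.inputs["input"].CopyFrom(tf.make_tensor_proto(batch))
            response = await stub.Predict(request, timeout=10.0)

        scores = tf.make_ndarray(response.outputs["scores"])
        return {"defect_probability": float(np.max(scores))}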

    Skills Used
    Python, TensorFlow Serving, Jenkins, gRPC, Docker/Podman, FastAPI, Pandas

    Made To Order Inventory Forecasting

    The business was using Excel and a rolling-average approach to order new material, which caused both shortage issues and excessive WIP, depending on the material.

    I leveraged both the Data Robot platform and the statsmodels Python library to determine which models and parameters worked best for each material. Once I determined which approach worked best, I leveraged Azure Synapse notebooks to orchestrate and run the predictions across the various models.
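
    A simplified sketch of the per-material model comparison (column names, candidate models, and the backtest window are illustrative, not the exact production setup):

    import numpy as np
    import pandas as pd
    from statsmodels.tsa.arima.model import ARIMA
    from statsmodels.tsa.holtwinters import ExponentialSmoothing

    # Hypothetical input: monthly consumption per material with columns material, month, qty.
    df = pd.read_csv("consumption.csv", parse_dates=["month"])

    def backtest_mae(series: pd.Series, fit_fn, holdout: int = 6) -> float:
        # Fit on everything except the last `holdout` months and score the forecast against them.
        train, test = series[:-holdout], series[-holdout:]
        forecast = fit_fn(train).forecast(holdout)
        return float(np.mean(np.abs(forecast.values - test.values)))

    candidates = {
        "holt_winters": lambda s: ExponentialSmoothing(
            s, trend="add", seasonal="add", seasonal_periods=12).fit(),
        "arima": lambda s: ARIMA(s, order=(1, 1, 1)).fit(),
    }

    best_model = {}
    for material, grp in df.groupby("material"):
        series = grp.set_index("month")["qty"].asfreq("MS").fillna(0)
        scores = {name: backtest_mae(series, fit) for name, fit in candidates.items()}
        best_model[material] = min(scores, key=scores.get)  # model used for this material's forecasts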

    We reduced the material on-hand by 3%, which translated into a significant reduction of dollars tied up in material.

    Skills Used
    Data Robot, Azure Synapse, Azure Data Factory, Azure DevOps, Python, SQL

    Hack(AI)thon With Made To Order Overdue Order Prediction

    With the wide variety of materials and processes, it can be difficult to recognize there is an issue until an order is past due. This reduces customer satisfaction.

    A team consisting of members from purchasing, quality, and industrial engineering (and myself) worked for 2.5 days on the Data Robot platform to see if a model could be developed that identifies orders at high risk of being overdue.

    Our team won the hackathon based on the performance of our model.

    Made To Order Overdue Order Prediction Productive Solution

    With the wide variety of materials and processes, it can be difficult to recognize there is an issue until an order is past due. This reduces customer satisfaction. We developed a solution in the Hack(AI)thon, but the process needed to be automated to provide value to the business.

    I used an Azure Synapse Notebook to prepare existing data in the data lake for consumption on the Data Robot platform. I then leveraged a production deployment on the Data Robot platform to execute a batch job to generate prediction results. We displayed these results as part of the Made To Order Production Planning Power BI report.
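
    A rough sketch of the notebook's preparation step (PySpark inside Synapse, where `spark` is provided by the session; paths and column names are placeholders):

    from pyspark.sql import functions as F

    # Hypothetical lake paths and columns; the real extracts came from our SAP data.
    orders = spark.read.parquet("abfss://lake@storage.dfs.core.windows.net/curated/open_orders/")
    routing = spark.read.parquet("abfss://lake@storage.dfs.core.windows.net/curated/routing_steps/")

    features = (
        orders.join(routing, on="production_order", how="left")
              .withColumn("days_until_due", F.datediff(F.col("due_date"), F.current_date()))
              .groupBy("production_order", "material", "days_until_due")
              .agg(F.count("operation").alias("open_operations"),
                   F.sum("planned_hours").alias("remaining_hours"))
    )

    # The Data Robot batch prediction deployment then reads this staged file as its intake dataset.
    features.coalesce(1).write.mode("overwrite").csv(
        "abfss://lake@storage.dfs.core.windows.net/staging/overdue_features/", header=True)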

    The business has been able to better react to high risk orders, which has improved on-time delivery by 20%.

    Skills Used
    Azure Data Lake, Azure Synapse, Azure DevOps, Data Robot

    Made To Order Production Improvements

    One of our plants in Fort Mill largely produces products made to order, which comes with unique challenges when compared to our more typical scheduled and batch processes.

    I built an ETL process that brought in sales and production data and produced a report relating the two, making it easy to understand the status of the production orders, operations, and materials for a given sales order.

    This improvement reduced the toil of gathering the data and increased the update frequency from once per day to every 2 hours, allowing the business to react more quickly to issues.

    Skills Used
    SQL, Azure Data Factory, Azure DevOps, Power BI

    Globalization of the Plant Dashboard

    The Plant Dashboard developed for Region Americas provides valuable information about key operations KPIs, but it wasn't available for plants in the European and Asia-Pacific regions.

    I supported redesigning the ETL processes to accommodate the different datasets in our quality checks and larger volumes of data. I also led the initiative to migrate multiple resources (Azure Data Factory, Azure SQL, Azure Data Factory Data Flows, Denodo) from our development environment to our productive environment.

    Our usage and engagement increased 30% and continues to increase as awareness of the tool spreads globally.

    Skills Used
    Azure DevOps, Azure Data Lake, Azure Data Factory, Azure Data Factory Data Flows, SQL, Denodo

    Region Americas Denodo Migration

    Our central data platform team required migrating our SAP BW ETL process from Denodo to the BW MDX connector in Azure Data Factory.

    I created a tool in Python using the Typer framework that took an MDX query generated from Denodo (and typically tweaked by me) and generated a mapping for Azure Data Factory to ensure the column naming conventions were maintained. I also developed a set of quality verification tooling to ensure that the data matched between the new and existing systems.
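
    A condensed sketch of the idea (the MDX parsing and the Azure Data Factory mapping format are simplified here; the real tool handled more cases):

    import json
    import re
    from pathlib import Path

    import typer

    app = typer.Typer()

    # Denodo exposes BW columns like "[Measures].[0QUANTITY]"; ADF needs a flat target name.
    COLUMN_PATTERN = re.compile(r"\[(?P<dimension>[^\]]+)\]\.\[(?P<member>[^\]]+)\]")

    @app.command()
    def build_mapping(mdx_file: Path, output_file: Path = Path("mapping.json")) -> None:
        """Generate an ADF copy-activity column mapping from an MDX query (hypothetical format)."""
        mdx = mdx_file.read_text()
        mappings = []
        for match in COLUMN_PATTERN.finditer(mdx):
            source = f"[{match['dimension']}].[{match['member']}]"
            # Keep the existing naming convention so downstream views do not break.
            target = match["member"].lower().replace(" ", "_")
            mappings.append({"source": {"name": source}, "sink": {"name": target}})
        output_file.write_text(json.dumps({"type": "TabularTranslator", "mappings": mappings}, indent=2))
        typer.echo(f"Wrote {len(mappings)} column mappings to {output_file}")

    if __name__ == "__main__":
        app()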

    We were able to migrate all required views off of the Denodo platform without production interruption, allowing the central platform team to discontinue the existing license.

    Skills Used
    Denodo, Azure Data Lake, Azure Data Factory, Python, Typer, MDX Query Language, SAP

    Tableau to Power BI Migration

    The company was switching from Tableau to Power BI, and all existing reports needed to be migrated. To reduce the labor required for this migration, the original data processes needed to be migrated to sources that could be consumed in Power BI in an automated way.

    I migrated dozens of manual and on-prem/local processes to the cloud using resources like Azure Data Factory, Data Flows, Azure SQL server, and Azure Data Lake.

    The result was an overall reduction in labor of 500 hours per year.

    Skills Used
    Azure Data Lake, Azure DevOps, Azure Data Factory, SQL, Denodo, SAP

    Colinx Report Automation

    Several hours a day were spent on a highly manual data aggregation process that was error prone and difficult for others to perform if the key resource was out of the office.

    I leveraged Azure Data Factory and Azure Data Flows to combine SAP tables from our Data Lake, replicate the logic in the SAP transactions, and sink the data into a database that could be consumed by the subsequent process.

    The improvement eliminated 200+ hours of labor and democratized the process, allowing others to complete it as necessary.

    Skills Used
    SAP, Azure Data Lake, Azure Data Factory Data Flows, Azure Data Factory, SQL

GKN Automotive - Global Product Development Engineer

2018-11-05 to 2021-07-09

I drove V.A.V.E. improvements to reduce manufacturing costs by reducing scrap and simplifying the design of the hydraulically actuated clutch products. I also acted as the local subject matter expert for the end-of-line testing processes for the clutch products.

Accomplishments

    Streamlined EOL Data Analysis

    The EOL produced multiple types of data: single-point measurements for each run, stored in a SQL database against a DatabaseID, and per-run time-series data stored in CSVs on the machine, where the files had different schemas depending on the part type and were named based on the part serial number.

    I leveraged Python in a Jupyter notebook to extract the data from the SQL database and analyzed it for throughput issues (failures on the first attempt) and scrap (failures after 3 attempts). I then joined this data with another table/station to get a list of all of the serial numbers I needed to extract the time-series data for, and would sink this to a tab in an Excel file.
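
    A simplified sketch of that extraction step (connection string, table, and column names are placeholders):

    import pandas as pd
    import sqlalchemy

    # Placeholder connection string and table/column names; the real EOL database differed.
    engine = sqlalchemy.create_engine(
        "mssql+pyodbc://eol-server/EOL?driver=ODBC+Driver+17+for+SQL+Server")

    runs = pd.read_sql(
        "SELECT DatabaseID, SerialNumber, Attempt, Result, FailureCode "
        "FROM TestRuns WHERE TestDate >= CAST(GETDATE() AS DATE)", engine)

    # Throughput issues: failures on the first attempt; scrap: still failing after 3 attempts.
    first_attempt_failures = runs[(runs.Attempt == 1) & (runs.Result == "FAIL")]
    scrap = runs[(runs.Attempt >= 3) & (runs.Result == "FAIL")]

    # Serial numbers whose time-series CSVs need to be pulled from the air-gapped machines.
    to_extract = pd.concat([first_attempt_failures, scrap]).SerialNumber.drop_duplicates()

    with pd.ExcelWriter("eol_failures.xlsx") as writer:
        first_attempt_failures.to_excel(writer, sheet_name="throughput", index=False)
        scrap.to_excel(writer, sheet_name="scrap", index=False)
        to_extract.to_frame().to_excel(writer, sheet_name="files_to_copy", index=False)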

    The EOL machines that stored the CSV files were air-gapped, so I had no other choice but to use an external drive and copy the files from the machine. Initially, I was either searching for the specific files by name or copying all files I might need (both were time-consuming processes). I eventually developed an Excel VBA script that would take the list of units from the first notebook and copy just those units to the external drive. I would have preferred to use Python, but I was restricted on what I could install on the machines.

    Once I had the files copied from the machines to our fileshare, I again used Python to plot the time-series data so I could analyze what was causing the parts to fail. My tool of choice was Plotly because it was easy to toggle different series (which might be different parameters like output torque or motor speed, or different runs) and zoom into specific areas of the test run(s). Once I had some notes written up about what had happened and any recommendations for the systems or process engineers, I would export the notebook to HTML using nbconvert.
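
    The plotting itself was simple; a sketch along these lines (with made-up file and column names), where each parameter is its own legend-toggleable trace:

    import pandas as pd
    import plotly.graph_objects as go

    # Hypothetical CSV layout; the real schema varied by part type.
    run = pd.read_csv("SN12345_run1.csv")

    fig = go.Figure()
    # One trace per parameter, so clicking the legend toggles it while investigating a failure.
    for column in ["output_torque", "motor_speed", "motor_current"]:
        fig.add_trace(go.Scatter(x=run["time_s"], y=run[column], mode="lines", name=column))

    fig.update_layout(title="SN12345 - run 1", xaxis_title="time [s]", hovermode="x unified")
    fig.show()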

    I continued to refine this process, including implementing papermill to create a new version of the notebook each time so I could keep specific notes for those failures without manually saving new versions. I also moved some of the key metrics out to Plotly Dash, which allowed a more interactive view of how the processes were performing.
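
    The papermill step was essentially one call per day, along these lines (notebook names and the parameter are illustrative):

    import datetime
    import papermill as pm

    # One dated copy of the analysis notebook per run, so notes about specific failures
    # stay with that day's output instead of overwriting the template.
    today = datetime.date.today().isoformat()
    pm.execute_notebook(
        "eol_failure_analysis.ipynb",                      # template notebook
        f"reports/eol_failure_analysis_{today}.ipynb",     # dated output copy
        parameters={"analysis_date": today},               # consumed by the notebook's parameters cell
    )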

    I increased the frequency of failure analysis from once per week to once per day and trimmed the process down to less than 30 minutes most days (including walking out to the machines). This allowed our teams to react more quickly and more accurately.

    Skills Used
    Jupyter, Python, Plotly Tools, SQL

    Clutch Break-In Analysis

    Because new clutches were not worn in, we would see a change in torque characteristics between the initial and subsequent runs. The systems engineers in Germany believed this behavior had more to do with the entire system than with just the clutch, which could allow us to dramatically simplify the logic for running the clutch assemblies again in a different unit.

    WIP

    WIP

    Re-Evaluate Production Limits For Reduced Scrap

    The "slope" failures on the clutch end of line tester contributed to the highest throughput impact and 2nd highest scrap of all failures on the end of line.

    First, I worked with our systems engineering team to understand what could be causing the behavior. We were able to assign a cause based on the difference between test machine and vehicle dynamics. Once we understood that the behavior was expected, I utilized Jupyter notebooks, pandas, and Plotly to analyze the distribution of the data and determine new limits. I leveraged the HTML export feature of Jupyter to create a dynamic report that was used to drive process limit changes.
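
    A pared-down sketch of the limit evaluation (the parameter name, current limits, and the sigma multiple are placeholders; the actual proposal was agreed with systems engineering):

    import pandas as pd

    # Hypothetical columns; "slope" stands in for the EOL parameter whose limits were re-evaluated.
    runs = pd.read_csv("eol_results.csv")
    mean, std = runs["slope"].mean(), runs["slope"].std()

    current = {"lower": 0.85, "upper": 1.15}                       # placeholder existing limits
    proposed = {"lower": mean - 4 * std, "upper": mean + 4 * std}  # candidate limits from the data

    def fallout(limits: dict) -> float:
        # Fraction of runs that would fail the slope check under the given limits.
        outside = (runs["slope"] < limits["lower"]) | (runs["slope"] > limits["upper"])
        return float(outside.mean())

    print(f"Fallout at current limits:  {fallout(current):.2%}")
    print(f"Fallout at proposed limits: {fallout(proposed):.2%}")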

    The updated limits resulted in $80,000 in annual savings between the scrap reduction and the throughput improvement.

    Skills Used
    Python, Jupyter, Pandas, Plotly Tools, Assembly Process Engineering, HTML/CSS/Javascript

    Leveraged Data Analysis to Justify Change in Product

    When a pump/motor combination was "tight", or required more torque to run, it could create a "sawtooth"-like behavior in the system, which would cause failures on the EOL. This behavior created the most scrap of any issue and the 2nd most throughput issues.

    I worked with the systems engineering team in Germany to develop a software fix that would address the issue. We then ran a test to validate effectiveness by using WOW (worst of the worst) units with the updated software to show that the issue would be addressed with the change.

    One of the biggest challenges was identifying, across tens of thousands of runs, what the true impact of the issue was. Using tools like Jupyter, pandas, and Plotly, I was able to find and filter for units highly likely to show the condition based on the single-point data in the SQL database. I then further refined these results by analyzing the time-series data for output torque, motor speed, and motor current. The combination of these sources allowed us to identify the issue across all units, which gave us the data required to justify the change.

    Although we were able to prove the potential benefit using data, the decision was made not to move forward with the software change due to the cost of validating the change with the customer. Although not as effective, we were able to implement other process changes to run the pumps for longer before EOL testing, which reduced the throughput impact.

    Skills Used
    Jupyter, Python, Pandas, Plotly Tools, HTML/CSS/Javascript, SQL

    D2UG Failure Analysis and Resolution

    We had an abnormally high fallout rate for a new product we produced.

    WIP

    WIP

GKN Automotive - Advanced Assembly Engineer

2014-12-15 to 2018-11-02

I was responsible for improving existing clutch assembly processes as well as designing, validating, and implementing new assembly processes.

Accomplishments

    Completed Validation and Launch of Clutch Line 2

    A new assembly line was required to support the clutch assembly process for brand new products.

    I developed the PFMEA, Control Plan, and work instruction documentation, validated gages, and compiled and agreed to a list of required improvements while at the supplier. I worked with facilities to coordinate the installation of the equipment at our facility, and I led the effort to get customer approval (PPAP) of the new processes for each of the new products (4 products across 3 customers).

    We were able to get full customer approval.

    Skills Used
    P.F.M.E.A., Assembly Process Engineering, PPAP

    Completed Design, Validation, and Launch of Clutch Lines 3 and 4

    Upcoming production volumes required additional assembly lines, and we had a long list of lessons learned that should be implemented in the new processes.

    I created a requirements document and supported multiple suppliers in creating proposals. I then vetted the proposals based on how well they met the requirements and how confidently we could determine, from the information provided, that the requirements would be met. I then made my recommendation to the purchasing department.

    Once the supplier was kicked off, I held weekly design reviews to ensure that everything stayed on track. We provided a P.F.M.E.A. based on the initial proposal and continued to update the document while the line was designed.

    Once the line was partially operational, we started a run-off process, which included testing all of the quality-critical processes to ensure that we would get the results we expected. Once the line was fully operational, we completed all of the necessary validation steps to get the equivalent of PPAP approval on the supplier's floor.

    I worked with facilities to coordinate the installation of the equipment at our facility, and I led the effort to get customer approval (PPAP) of the new processes.

    For the 2nd line, the process for validation and approval at the supplier's facility and after installation was essentially the same.

    We were able to get full customer approval for both lines.

    Skills Used
    Assembly Process Engineering, PPAP, P.F.M.E.A.

    Used Simulation to Justify EOL Process Change

    The existing EOL processes did not have enough capacity to meet current and future production demands, and the current cycle time would have required 3 additional machines. There was an idea to better leverage the existing machine with adjustments to the design, but we couldn't prove the improvement with standard tools.

    I leveraged VBA in Excel to create a discrete event simulation of the proposed change. This allowed us to better predict the expected cycle time and throughput and justify the modifications to the machine.
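
    The original model was VBA in Excel; the same discrete-event idea, rendered as a small Python/simpy sketch with made-up cycle times (not the real model or its measured data), looks like this:

    import random
    import simpy

    # Made-up numbers purely for illustration; the real model used measured cycle times
    # and the proposed machine modifications.
    ARRIVAL_INTERVAL_S = 55
    TEST_TIME_S = (90, 110)     # min/max test duration after the proposed change
    N_MACHINES = 2
    SHIFT_SECONDS = 8 * 3600

    completed = 0

    def unit(env: simpy.Environment, machines: simpy.Resource):
        global completed
        with machines.request() as slot:     # wait for a free EOL machine
            yield slot
            yield env.timeout(random.uniform(*TEST_TIME_S))
        completed += 1

    def arrivals(env: simpy.Environment, machines: simpy.Resource):
        while True:
            yield env.timeout(ARRIVAL_INTERVAL_S)
            env.process(unit(env, machines))

    env = simpy.Environment()
    machines = simpy.Resource(env, capacity=N_MACHINES)
    env.process(arrivals(env, machines))
    env.run(until=SHIFT_SECONDS)
    print(f"Units tested per shift with {N_MACHINES} machines: {completed}")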

    We were able to meet the production demands with one additional machine instead of three, dramatically reducing the capital required.

    Skills Used
    Discrete Event Simulation, Assembly Process Engineering