Software development: Unit test
Introduction
CI/CD, which stands for Continuous Integration and Continuous Deployment (or Continuous Delivery), is a set of practices and tools used in software development to automate the process of building, testing, and deploying software. In simple terms, CI is a practice in which developers merge small, incremental code changes into a shared repository frequently. Automated build-and-test steps triggered on each change check that the code being merged works as expected. The code is then delivered quickly as part of the CD process. The CI/CD pipeline refers to the automation that moves incremental code changes from developers' desktops to production quickly and reliably.
1. Continuous Integration (CI)
Objective: Integrate code changes from multiple contributors into a shared repository frequently.
Key practices:
- Version Control: Use a version control system (e.g., Git) to track changes and manage collaboration.
- Automated Builds: Set up automated build systems to compile code and generate executable artifacts.
- Automated Testing: Implement automated testing (unit tests, integration tests) to verify that the code changes do not introduce errors.
- Frequent Integration: Integrate code changes frequently into a shared repository to detect and address issues early.
2. Continuous Deployment/Delivery (CD)
Objective: Automate the process of deploying code changes to production environments.
Key practices:
- Deployment Automation: Automate the deployment process to reduce manual errors and ensure consistency.
- Continuous Deployment (CD): Automatically deploy code changes to production after passing automated tests and other quality checks.
- Continuous Delivery (CD): Prepare code changes for deployment, but the actual deployment is triggered manually.
- Environment Parity: Ensure consistency between development, testing, and production environments to avoid unexpected issues.
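The difference between continuous deployment and continuous delivery can be sketched as a small decision function. This is only an illustration: the function name should_deploy and the mode strings "deployment" and "delivery" are hypothetical and do not come from any CI/CD tool.

```python
def should_deploy(checks_passed: bool, mode: str, approved: bool = False) -> bool:
    """Decide whether a change goes to production (illustrative only).

    mode is a hypothetical label:
      "deployment" -> release automatically once all checks pass
      "delivery"   -> release only after a person approves it
    """
    if not checks_passed:
        # A change that fails its automated checks is never released.
        return False
    if mode == "deployment":
        return True
    if mode == "delivery":
        return approved
    raise ValueError(f"unknown mode: {mode!r}")


# Continuous deployment: passing checks is enough.
auto_release = should_deploy(checks_passed=True, mode="deployment")

# Continuous delivery: the change is ready, but waits for manual approval.
waiting = should_deploy(checks_passed=True, mode="delivery")
released = should_deploy(checks_passed=True, mode="delivery", approved=True)
```

In both modes the change is built and tested automatically. Only the final trigger differs.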
CI/CD Pipeline
A CI/CD pipeline is a series of automated steps that code changes go through from development to production. It typically includes stages such as code compilation, automated testing, and deployment.
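As a rough sketch of this idea, a pipeline can be modelled as an ordered list of stages that stops at the first failure. The stage functions below (build, run_tests, deploy) are placeholders invented for this example. Real CI/CD tools define stages in their own configuration formats.

```python
from typing import Callable, List, Tuple

# A stage is a name plus a function that returns True on success.
Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> List[str]:
    """Run stages in order and stop at the first one that fails."""
    log = []
    for name, step in stages:
        ok = step()
        log.append(f"{name}: {'passed' if ok else 'failed'}")
        if not ok:
            # Later stages (for example deployment) are skipped.
            break
    return log


# Placeholder stages standing in for real build, test, and deploy steps.
def build() -> bool:
    return True


def run_tests() -> bool:
    return 1 + 1 == 2


def deploy() -> bool:
    return True


log = run_pipeline([("build", build), ("test", run_tests), ("deploy", deploy)])
```

Because the loop breaks on the first failing stage, a change that fails its tests never reaches the deployment stage.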
Key Components
- Source Code Repository: A central location (e.g., Git repository) where code changes are stored.
- Build Server: A server or service responsible for compiling code and generating build artifacts.
- Automated Testing Tools: Tools for running automated tests to ensure code quality.
- Artifact Repository: A repository to store compiled artifacts for deployment.
- Deployment Automation: Tools and scripts to automate the deployment process.
Benefits of CI/CD
- Rapid Feedback: Early detection of bugs and issues.
- Consistent Builds: Automated builds ensure consistency across different environments.
- Faster Time to Market: Continuous deployment accelerates the release cycle.
- Reduced Manual Errors: Automation reduces the chance of human errors during deployment.
- Increased Collaboration: Continuous integration promotes collaboration among development teams.
Popular CI/CD Tools:
- Jenkins: An open-source automation server widely used for building, testing, and deploying.
- Travis CI: A cloud-based CI/CD service that integrates with GitHub repositories.
- GitLab CI/CD: Integrated CI/CD functionality within the GitLab platform.
- CircleCI: A cloud-based CI/CD platform supporting multiple programming languages.
- GitHub Actions: Native CI/CD capabilities integrated with GitHub repositories.
Software Development Methodologies
Software development methodologies play a crucial role in shaping the approach, collaboration, and efficiency of development teams. This section delves into three prominent methodologies: Waterfall, Agile, and DevOps, each with its unique principles and purposes.
Unit Testing
Unit testing is a software testing technique where individual units or components of a software application are tested in isolation. The main purpose is to validate that each unit of the software performs as designed. In data science and analytics, unit testing can be applied to functions, modules, or algorithms to ensure they produce the expected output for a given set of inputs. It helps identify and fix bugs early in the development process, improving the overall reliability and maintainability of the code.
Unit testing vs integration testing
While unit testing ensures that each unit of code works properly on its own, integration testing ensures that the units work together. Integration tests focus on real-life use cases and often rely on external resources such as databases or web servers. A unit test, on the other hand, only needs data that is created exclusively for the test. It is therefore much easier to implement.
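A minimal sketch of a unit test that uses only data created inside the test itself, with no database or web server involved. The mean function and the test names are hypothetical examples, not part of the calculator example used later.

```python
def mean(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    if not values:
        raise ValueError("values must not be empty")
    return sum(values) / len(values)


def test_mean_with_in_memory_data():
    # The input is built inside the test; nothing external is needed.
    data = [2.0, 4.0, 6.0]
    assert mean(data) == 4.0


def test_mean_rejects_empty_input():
    try:
        mean([])
    except ValueError:
        return
    raise AssertionError("mean([]) should raise ValueError")
```

An integration test for the same code would instead read the values from the real data source and check the whole path from storage to result.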
Advantages of Unit testing
Here is a non-exhaustive list of the advantages of unit testing that make it a vital asset in the toolbox of a good programmer:
- Time saving: Some very basic errors can become quite difficult to identify during the integration testing phase, because of the many layers of code that accumulate. With unit tests, these errors can be detected simply, quickly, and early in the development of the code.
- Easier code changes: If you want to modify your code (e.g. change the regression method), it is easy to verify that the function still works as expected by running its unit test.
- Improved code quality: A good approach is to write the unit tests before writing the units themselves. This compels you to think about all the contingencies that the unit might face, which makes the unit simpler and more robust later on. This approach is known as test-driven development (TDD).
- Aid to understanding the code: Developers also use unit tests as explanatory documentation for each part of the code. It is often easy to understand the expected behaviour of a function by reading the associated unit test first.
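The test-first approach mentioned above can be sketched as follows. The tests are written first and describe the expected behaviour, and the implementation is then written to satisfy them. The safe_divide function and its default parameter are hypothetical, chosen only for illustration.

```python
# Step 1: write the tests first. They also document the expected behaviour.
def test_safe_divide_regular_case():
    assert safe_divide(6, 3) == 2


def test_safe_divide_by_zero_returns_default():
    assert safe_divide(1, 0, default=0.0) == 0.0


# Step 2: write the simplest implementation that makes the tests pass.
def safe_divide(numerator, denominator, default=None):
    """Divide two numbers, returning `default` when the denominator is zero."""
    if denominator == 0:
        return default
    return numerator / denominator
```

Reading the two tests is enough to learn how the function treats division by zero, which is the documentation role described in the last bullet.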
Weaknesses of unit testing
- It is impossible to test the infinite variety of contingencies that a unit might face. Passing the unit tests without a hitch is therefore not a total guarantee of correct operation.
- By construction, unit tests cannot test the interaction between units.
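As an illustration of the first weakness, here is a sketch in which a test passes although an untested kind of input behaves differently from what a caller might expect. The add function mirrors the calculator example used later in this article. The float case is an assumption added for this example.

```python
import math


def add(a, b):
    return a + b


def test_add_integers():
    # This test passes, but it only covers integer inputs.
    assert add(2, 3) == 5


# An untested contingency: binary floating point cannot represent
# 0.1 and 0.2 exactly, so the sum is not exactly equal to 0.3.
float_sum = add(0.1, 0.2)
exact_match = float_sum == 0.3              # False
close_match = math.isclose(float_sum, 0.3)  # True
```

A test suite that only checks integers would never reveal this, so code that compares float results with == could still fail in production.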
Automated testing
- What is automated testing?: Frameworks greatly simplify the automation of unit tests. The developer sets the criteria for the tests they wish to perform, and the framework then runs the tests automatically and provides detailed error reports.
- unittest: The basic framework for automated testing in Python is unittest. It is built into Python's standard library, follows the xUnit style, and provides classes and methods for creating and running tests.
- pytest: pytest is a popular third-party testing framework. It simplifies test discovery and execution and is known for its concise syntax and powerful features.
- nose: nose is another third-party testing framework that extends unittest and provides additional functionality. It is particularly useful for test discovery and running tests in parallel. Note that nose is no longer actively maintained, and nose2 is its successor.
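For comparison with the pytest example that follows, here is a minimal sketch of the same kind of test written in the xUnit style of unittest. The add function is defined inline so that the block is self-contained, and the class name TestAdd is an arbitrary choice.

```python
import unittest


def add(a, b):
    return a + b


class TestAdd(unittest.TestCase):
    """Tests are methods on a TestCase subclass, named test_*."""

    def test_add_positive_numbers(self):
        self.assertEqual(add(2, 3), 5)

    def test_add_negative_numbers(self):
        self.assertEqual(add(-2, -3), -5)

    def test_add_mixed_numbers(self):
        self.assertEqual(add(1, -5), -4)
```

If this were saved in a file, for example test_add_unittest.py (a hypothetical name), it could be run with python -m unittest test_add_unittest.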
Example:
- Create a calculator.py file. It will contain the 'add' function:
    def add(a, b):
        return a + b
- Create a test_calculator.py file and add the code:
    import pytest
    from calculator import add

    def test_add_positive_numbers():
        result = add(2, 3)
        assert result == 5

    def test_add_negative_numbers():
        result = add(-2, -3)
        assert result == -5

    def test_add_mixed_numbers():
        result = add(1, -5)
        assert result == -4
- This file contains test functions prefixed with test_. The pytest framework uses plain assert statements for assertions instead of the built-in unittest assertion methods.
- Now run the tests using:
    pytest test_calculator.py
Pytest will discover and run the tests, providing detailed output.
If we change one of the tests so that its check no longer holds, pytest reports a failure and we know exactly which test is not satisfied. For example, if we change the assertion in test_add_positive_numbers to assert result == 6, the green dot for that test turns into a red F. The error returned is an AssertionError, and pytest points to the exact line where it occurred.