
A Close Look at the 12 Factor App Methodology November 16, 2023

Posted by ficial in software, techy.

The 12 Factors are a set of principles for building scalable, maintainable, and efficient software applications, especially in the context of cloud computing. The 12 Factor methodology was established back in 2011, but it continues to be highly relevant today. This article takes a close look at each of the 12 factors, exploring what it is, how it might be implemented (especially in a cloud-based context), an example of it in practice, and a couple of the largest pitfalls or challenges around it, along with some ways to mitigate them. While a number of these are just The Way Things Are Done these days, an examination of the particular problems they solve offers context and a deeper understanding, which can help you apply them more effectively.

In very brief form, the 12 factors are:

  1. Codebase: One codebase tracked in version control, with many deployments.
  2. Dependencies: Explicitly declare and isolate dependencies.
  3. Config: Store configuration in the environment.
  4. Backing Services: Treat backing services as attached resources.
  5. Build, Release, Run: Strictly separate the build and run stages.
  6. Processes: Execute the app as one or more stateless processes.
  7. Port Binding: Export services via port binding.
  8. Concurrency: Use the process model to scale the application horizontally.
  9. Disposability: Maximize robustness with fast startup and graceful shutdown.
  10. Dev/prod Parity: Keep development, staging, and production as similar as possible.
  11. Logs: Treat logs as event streams.
  12. Admin Processes: Run admin/management tasks as one-off processes.

If you’re interested in a more general overview of 12 Factors, please check out my article Understanding the 12 Factor App Methodology — Why We Build Cloud Applications the Way We Do. Otherwise, let’s get into it.

1. Codebase

Almost all development these days follows the Codebase factor, but I’m covering it in detail for completeness, and because there are sometimes nuances and reasons that people miss even when they follow this principle. The Codebase factor promotes maintaining a single codebase for an application, regardless of the number of deployment environments. This approach ensures consistency, as all changes are tracked in one central repository, greatly simplifying version control and reducing the risk of discrepancies between different environments. It also enhances collaboration among team members, as everyone works on a single source of truth, facilitating easier code reviews and shared understanding of the application. Implementing and adhering to the Codebase factor ensures a streamlined and consistent development process, greatly reducing operational complexities and improving the maintainability of the application.

Implementing the Codebase Factor

Start by hosting the application’s source code in a version control system like Git. Ensure that any change to the application, whether it’s a new feature, bug fix, or configuration adjustment, is committed to this repository. For different deployment environments (like testing, staging, production), use the same codebase but deploy it with environment-specific configurations. This can be achieved using environment variables or separate configuration files that are not tracked in the code repository. The deployment pipeline should be set up in a way that it can deploy the same application from the central repository to any environment with the necessary adjustments.

Example of the Codebase Factor In Practice

A typical example is a web application developed on GitHub. The team uses the same GitHub repository to manage all the application’s code. When new features are developed, they are first tested in a development environment, then in staging, and finally deployed to production. Despite these environments being different, the source code deployed in each is identical and comes from the same repository. Any configuration differences are handled through environment variables.

Pitfalls and Complexities of the Codebase Factor

Managing branches in version control can become complex. This can lead to confusion about which branches are deployed in which environments. To mitigate this, select a clear and consistent branching strategy, like Git Flow or Trunk-Based Development, and maintain rigorous discipline about merging and deploying branches. Automate as much of this process as possible to reduce human error.

Sometimes, developers might be tempted to include environment-specific code or hacks in the codebase to address issues in specific environments. Mitigate this by strictly enforcing the separation of code and configuration. Use feature flags or environment variables to handle environment-specific behavior. Regular code reviews can help catch violations of this principle, but should not be relied upon as the sole mechanism to do so. 
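
To make that concrete, here is a minimal sketch (in Python) of handling environment-specific behavior through a feature flag read from configuration rather than through environment-specific code; the flag name is hypothetical:

```python
import os

# Hypothetical flag: the same code ships to every environment;
# only the FEATURE_NEW_CHECKOUT environment variable differs per deployment.
NEW_CHECKOUT_ENABLED = os.environ.get("FEATURE_NEW_CHECKOUT", "false").lower() == "true"

def checkout_flow() -> str:
    # Branch on configuration, never on environment-specific hacks in the code.
    if NEW_CHECKOUT_ENABLED:
        return "new checkout flow"
    return "legacy checkout flow"

if __name__ == "__main__":
    print(f"Using the {checkout_flow()}")
```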

2. Dependencies

The Dependencies factor emphasizes explicitly declaring and isolating dependencies, ensuring that the application does not rely on implicit existence of system-wide packages. This approach guarantees that the application runs consistently across all environments, as it carries all necessary dependencies with it. It also enhances the portability of the application and reduces the likelihood of conflicts between different projects running on the same system, as each project manages its own dependencies. By diligently managing dependencies, developers can ensure that their application behaves consistently and securely across all environments, significantly reducing “it works on my machine” type issues.

Dependencies Factor vs Convention-Over-Configuration

Convention-over-configuration is a paradigm that reduces the need for explicit configuration by providing sensible defaults, and it may at first seem to be in conflict with the “explicitly declaring and isolating dependencies” aspect of the Dependencies factor. In reality, the two complement each other. Convention-over-configuration streamlines development by providing sensible defaults, reducing the need for extensive configuration. However, it doesn’t eliminate the necessity for specific configuration choices, especially in cases where the defaults don’t suffice. The Dependencies factor plays a crucial role here; by requiring explicit declaration, it ensures that an application’s unique dependency needs are met, while still adhering to the broader framework of sensible defaults. This explicit declaration doesn’t contradict convention-over-configuration; rather, it provides the necessary specificity and control within the boundaries of a framework’s conventions. In essence, while convention-over-configuration sets the general path for development with defaults, the Dependencies factor allows developers to clearly define and manage the specific external libraries their applications need, ensuring both efficiency and precision. For example, a Ruby on Rails project starts with a set of sensible defaults but allows (and expects) the developer to explicitly define additional gems (libraries) in the Gemfile.

Implementing the Dependencies Factor

To implement the Dependencies factor, use dependency management tools specific to your application’s programming language. For example, if you’re working with a Python project, you would use tools like pip and a requirements.txt file, or Pipenv, which allow you to explicitly declare all the libraries your application needs. These tools enable you to specify exact versions of each dependency, ensuring that your application doesn’t unexpectedly break due to an updated dependency. Moreover, during deployment, instead of relying on pre-installed libraries on the server, use a build step that installs dependencies as defined in your configuration file. This ensures that your application has everything it needs to run, independent of the server’s state.
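
For illustration, a pinned requirements.txt might look something like the following (the package names and versions are purely illustrative); the build step then runs pip install -r requirements.txt, so the application never depends on whatever happens to be pre-installed on the server:

```
# requirements.txt -- every dependency declared explicitly, with exact versions
flask==3.0.0
requests==2.31.0
psycopg2-binary==2.9.9
```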

Example of the Dependencies Factor In Practice

In a Node.js application, dependencies are managed using a package.json file. This file lists all libraries the application requires, including specific versions or version ranges. When the application is deployed, running npm install or yarn install in the project directory installs these dependencies as specified, ensuring that the application has all its necessary libraries, regardless of the deployment environment.

Pitfalls and Complexities of the Dependencies Factor

When multiple dependencies require different versions of the same sub-dependency, it can lead to conflicts that are hard to resolve. Use dependency resolution tools provided by your package manager, and regularly update dependencies to maintain compatibility. Tools like npm audit or pip check can help identify and resolve conflicts.

Relying on outdated dependencies can expose the application to security vulnerabilities and compatibility issues. Regularly review and update dependencies to newer, secure versions. Automate this process with tools like Dependabot, which can automatically create pull requests to update dependencies.

3. Config

The Config factor, which advocates for storing configuration in the environment rather than in the code, has a major impact in two critical areas. Firstly, it enhances security by keeping sensitive information like database credentials and API keys out of the codebase, reducing the risk of exposing them in version control systems. Secondly, it increases the portability and flexibility of the application, as the same codebase can be used across different environments (e.g. development, staging, production) with environment-specific configurations. This separation simplifies the process of updating configurations without needing to modify the code, enabling easier and safer deployment practices.

Implementing the Config Factor

Implementing the Config factor involves storing all configuration data – which might include database URLs, external service credentials, and application settings – in some system independent of the application’s code, such as environment variables, config files (that are not in the application’s codebase), config servers, or secrets management systems. These settings are then accessed by the application at runtime. In development, tools like dotenv can be used to manage these variables locally without hardcoding them into the codebase. In production, these configurations can be set in the server’s environment, in a container orchestration tool like Kubernetes, or using a cloud provider’s configuration service. The key is that the codebase remains the same across all environments, and the environment-specific configurations are injected at runtime from these external sources. By effectively managing configurations as per the Config factor, applications become more secure, flexible, and easier to manage across various deployment environments.

Example of the Config Factor In Practice

A web application requires a database connection to operate. Instead of hardcoding the database URL and credentials into the application code, these are stored as environment variables (like DATABASE_URL). When the application runs, it reads this variable to establish the database connection. In development, this variable might point to a local database, while in production, it points to a production database server.
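
A minimal sketch of that pattern in Python, assuming the variable is named DATABASE_URL as in the example above:

```python
import os

# The codebase contains no credentials or environment-specific values;
# the connection string is injected by the environment at runtime.
DATABASE_URL = os.environ["DATABASE_URL"]

def describe_connection() -> str:
    # A real application would hand DATABASE_URL to its database driver here.
    return f"connecting to {DATABASE_URL}"

if __name__ == "__main__":
    print(describe_connection())
```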

Pitfalls and Complexities of the Config Factor

Juggling different configurations for multiple environments can become complex and error-prone. Use configuration management tools or services (like HashiCorp Vault or AWS Parameter Store) to centralize and manage configurations securely. Implement checks to ensure that the correct configurations are loaded for each environment.

There’s a risk of inadvertently exposing sensitive configuration data, especially when dealing with multiple team members and environments. Limit access to environment configurations based on roles and responsibilities. Regularly audit and rotate sensitive information. Additionally, consider using service accounts with limited permissions for different parts of your application.

4. Backing Services

The Backing Services factor means treating all external services used by the application (like databases, messaging systems, or caching systems) as attached resources, which can be easily replaced or modified without changes to the application’s code. An “attached resource” is any external service that an application consumes as part of its normal operation; resources are considered “attached” because they can be connected, disconnected, or swapped out by altering configuration settings, without necessitating changes to the application’s core codebase. This approach enhances modularity and flexibility, allowing for easy swapping or upgrading of services without impacting the application’s core functionality. It also simplifies the deployment and scaling of applications across different environments, as the connection details to these services can be modified without altering the application logic. By treating backing services as attached resources, applications gain a high degree of flexibility and adaptability in their operational environments, facilitating easier management and scalability.

Implementing the Backing Services Factor

Implementing the Backing Services factor involves designing and implementing the application in a way that any service or external system it interacts with is referenced through configurable parameters. This means that the application doesn’t hardcode the details of how to connect to these services, but rather dynamically reads them from its environment.

Example of the Backing Services Factor In Practice

Consider an application that uses a PostgreSQL database. Instead of hardcoding the database URL and credentials in the application, these details are stored in environment variables. When the application needs to switch to a different PostgreSQL instance, perhaps for staging or production, the only change required is updating the environment variables with the new database’s connection details. The application itself requires no changes, as it always refers to these variables for database connections.
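
As a rough sketch of what that looks like in code, here the database is reached only through a connection string supplied by the environment (SQLAlchemy is used as one possible client library; swapping database instances changes the variable, not the code):

```python
import os
from sqlalchemy import create_engine, text

# The database is an attached resource: the application only knows
# whatever connection string the environment hands it.
engine = create_engine(os.environ["DATABASE_URL"])

def healthcheck() -> bool:
    # Behaves identically whether DATABASE_URL points at a local,
    # staging, or production PostgreSQL instance.
    with engine.connect() as conn:
        return conn.execute(text("SELECT 1")).scalar() == 1
```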

Pitfalls and Complexities of the Backing Services Factor

When switching backing services, there can be compatibility issues, such as different data formats or query languages. Use abstract layers or adapters in your application architecture to handle different services’ idiosyncrasies. One example of this is using ORM (Object-Relational Mapping) tools for databases, which can smooth over differences between database systems.

Managing and securely storing the configuration for different services, especially in multiple environments, can become complex. Use configuration management tools and services like HashiCorp Vault or AWS Parameter Store to securely manage service configurations. Ensure that these configurations are regularly reviewed and audited for security and consistency.

5. Build, Release, Run

The Build, Release, Run factor emphasizes strict separation between the stages of building, releasing, and running an application. This clear separation enhances consistency and reliability, as it ensures that the application is built and deployed in a predictable and controlled manner. It also facilitates better version control and rollback capabilities, allowing teams to quickly respond to issues in production by rolling back to a previous release. By adhering to the Build, Release, Run factor, organizations can achieve a more streamlined, reliable, and efficient deployment process, enhancing the overall stability and agility of their software delivery lifecycle.

Implementing the Build, Release, Run Factor

Implementing this factor involves establishing distinct and well-defined stages in the deployment process. The build stage involves compiling code and incorporating dependencies to create an executable package. In the release stage, the build is combined with the current environment’s configuration to create a versioned release. Finally, the run stage is where the application, as defined by the release, is executed in the target environment. Automation plays a key role in ensuring the integrity of these stages. Continuous Integration (CI) and Continuous Deployment (CD) tools can automate the build and release processes, ensuring they are repeatable and consistent. The use of containerization technologies like Docker can also aid in encapsulating the build stage, ensuring consistency across different environments. Kubernetes, as a container orchestration platform, strongly aligns with the Build, Release, Run factor by facilitating the consistent deployment and management of applications across various environments. It enables the clear separation of these stages through containerization — where the build stage produces container images, the release stage is managed through Kubernetes’ deployment configurations, and the run stage is executed as containerized instances managed by Kubernetes, ensuring a reliable and scalable execution environment.

Example of the Build, Release, Run Factor In Practice

A web application uses a CI/CD pipeline for its deployment. The build stage occurs when a new commit is made to the main branch, triggering the CI tool to compile the code and build a Docker image. In the release stage, the Docker image is tagged with a version and combined with production configuration settings. Finally, in the run stage, this Docker image is deployed to a Kubernetes cluster, where it starts serving traffic.
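
As a rough sketch, the three stages of such a pipeline might boil down to commands like these (image names, tags, and manifest paths are hypothetical):

```
# Build: produce an immutable artifact from the codebase
docker build -t registry.example.com/myapp:1.4.2 .
docker push registry.example.com/myapp:1.4.2

# Release: combine that build with environment-specific configuration
# (e.g. Kubernetes manifests that reference the image tag plus ConfigMaps/Secrets)
kubectl apply -f deploy/production/

# Run: the orchestrator starts containers from the released image
kubectl rollout status deployment/myapp
```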

Pitfalls and Complexities of the Build, Release, Run Factor

There can be a risk of configuration drift between environments, where the release stage doesn’t accurately reflect the production environment. Use Infrastructure as Code (IaC) to manage and version your configurations, ensuring consistency across all environments.

As the number of releases increases, managing and tracking these versions can become complex, especially in rollback scenarios. Implement robust versioning strategies and maintain a detailed changelog. Ensure that your deployment tools support easy tracking and rollback of releases.

6. Processes

The Processes factor means executing the application as one or more stateless processes, which ensures scalability and resilience in software deployment. By maintaining statelessness, applications can easily scale horizontally, as each process can be independently started, stopped, or multiplied without impacting others. This approach also improves fault tolerance, as any single process failure has a limited impact, and recovery involves simply starting a new process, not restoring state. Implementing the Processes factor allows for scalable, robust, and flexible application architectures, crucial in cloud-based and distributed environments.

Implementing the Processes Factor

Design your application to be stateless, meaning it should not store any session or user data internally between requests – instead, any persistent data should be stored in external services like databases or caching systems. Ensure that each process is ephemeral and can be stopped and started at any time without loss of data. Load balancers can be used to distribute requests across multiple instances of the application. In containerized environments, you can manage these stateless processes effectively by defining each process in its own container and using the orchestration tool to manage their lifecycle. Additionally, using session management solutions like Redis for storing session data externally can help maintain statelessness.

Example of the Processes Factor In Practice

Web requests are handled by AWS Lambda functions. Each Lambda function acts as an independent, stateless process. When a request comes in, for instance, from an API Gateway or a direct invocation, AWS Lambda automatically initializes and executes a function instance to handle the request. These function instances are entirely independent of each other; they do not share any in-memory state and rely on external services for any persistent data storage needs.

More specifically, consider a Lambda that handles user login requests. Each invocation of the Lambda function processes the login request and might verify user credentials against a user database managed in Amazon RDS or Amazon DynamoDB. The function’s statelessness is evident in that it doesn’t store any session data internally; it authenticates each request independently, possibly generating a token (like JWT) for client-side session handling, or stores session information in an external store like Amazon ElastiCache.
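
A heavily simplified sketch of such a handler is below (the table name, secret source, and token shape are hypothetical, and a real system would use a proper password-hashing scheme rather than plain SHA-256); the key point is that nothing is remembered between invocations:

```python
import hashlib
import json
import os
import time

import boto3
import jwt  # PyJWT

dynamodb = boto3.resource("dynamodb")
users = dynamodb.Table(os.environ["USER_TABLE"])   # hypothetical table, name from config
JWT_SECRET = os.environ["JWT_SECRET"]              # injected, never hard-coded

def hash_password(password: str) -> str:
    # Illustrative only; use a real key-derivation function in practice.
    return hashlib.sha256(password.encode()).hexdigest()

def handler(event, context):
    body = json.loads(event["body"])
    record = users.get_item(Key={"username": body["username"]}).get("Item")
    # No session state lives in the function; every request is authenticated independently.
    if not record or record["password_hash"] != hash_password(body["password"]):
        return {"statusCode": 401, "body": "invalid credentials"}
    token = jwt.encode(
        {"sub": body["username"], "exp": int(time.time()) + 3600},
        JWT_SECRET,
        algorithm="HS256",
    )
    return {"statusCode": 200, "body": json.dumps({"token": token})}
```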

Pitfalls and Complexities of the Processes Factor

While the application itself is stateless, managing external state (like user sessions or persistent data) can be challenging. Use robust external storage solutions like SQL databases, NoSQL databases, or distributed caches (like Redis or Memcached) to manage state. Ensure that these services are highly available and scalable.

In distributed environments, maintaining user sessions across multiple stateless processes can be complex. Implement centralized session management using external session stores or token-based authentication (like JWTs) which don’t require server-side session storage.

7. Port Binding

Port binding focuses on enhancing the independence and portability of applications by allowing them to run in diverse environments without reliance on external web servers. Port binding enables independence from external web servers by allowing applications to internally manage their network communications, essentially embedding the web server functionality directly within the application itself. In traditional web application architectures, an application often relies on an external web server (like Apache or Nginx) to handle HTTP requests and then pass them to the application. That setup requires configuring and maintaining both the external web server and the application, tying the application’s deployment and operation to the specifics of the web server environment. An application that implements port binding typically uses a library or framework that can listen to and handle HTTP requests directly. For instance, in a Node.js application, the Express.js framework can create an HTTP server that listens on a specified port. Similarly, Java applications can use embedded servers like Jetty or Tomcat.

The self-sufficiency of port binding facilitates easy and consistent deployment across development, testing, and production environments. This approach is particularly well-suited for containerized environments, where each container typically runs a single process. The application, serving as its own web server, can be packaged into a container, listening on and exposed through a specified port. This setup aligns perfectly with the container model, where each container is an isolated environment with its own network namespace. By adhering to the port binding factor, applications can be more easily deployed, scaled, and managed, in modern, distributed, and containerized environments, enhancing overall deployment flexibility and efficiency.

Implementing the Port Binding Factor

Implementing the Port Binding factor involves configuring your application to act as its own server, binding to a specified port to handle incoming requests. This is typically achieved using libraries or frameworks within the application that can listen on a network port (e.g., Express.js for Node.js, Flask for Python). In containerized deployments you specify in the configuration (e.g. Dockerfile) which port the application listens to. This port is then mapped to a port on the host machine, allowing external access. In orchestration platforms, services are defined to manage the exposure of these ports to the outside world, handling routing and load balancing.
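
For example, a minimal Flask application that acts as its own server might look like this (reading the port from a PORT environment variable is a common convention, not a requirement):

```python
import os
from flask import Flask

app = Flask(__name__)

@app.route("/")
def index():
    return "hello from a self-contained service"

if __name__ == "__main__":
    # The app embeds its own HTTP server and binds directly to a port,
    # rather than relying on an external Apache/Nginx process.
    app.run(host="0.0.0.0", port=int(os.environ.get("PORT", "3000")))
```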

Example of the Port Binding Factor In Practice

A Node.js application uses Express.js to create a web server that listens on port 3000. When this app is containerized with Docker, the Dockerfile includes an instruction to expose port 3000. During deployment, this internal port is mapped to an external port on the Docker host, making the application accessible outside the container.

Pitfalls and Complexities of the Port Binding Factor

Especially in development environments, multiple applications might try to bind to the same port, leading to conflicts. Use environment variables to dynamically assign ports, or utilize development tools that automatically manage port allocation to prevent conflicts.

Managing how different services communicate in a distributed system, like a microservices architecture, can become complex. Use service discovery and orchestration tools that manage inter-service communications and abstract away the complexity of port management across multiple services and hosts.

8. Concurrency

The Concurrency factor in the 12 Factor App methodology advocates for scaling applications by running multiple instances of small, stateless processes. This approach enhances scalability and resilience, as it allows for efficient load distribution and quick recovery from failures by simply starting new instances. It also meshes very well with cloud-native practices by enabling the elastic scaling of applications, adapting to varying loads by dynamically adjusting the number of running processes or instances. The Concurrency factor is essentially a conceptualization of horizontal scaling, as it focuses on scaling applications by running multiple, simultaneous instances of processes. This approach to concurrency, wherein additional instances are added to handle increased load, mirrors the core principle of horizontal scaling, which expands capacity by connecting multiple hardware or software entities to work as a single system. By following the Concurrency factor, applications can achieve significant scalability and resilience, ensuring they can handle varying loads efficiently and maintain high availability. 

Implementing the Concurrency Factor

Implementing the Concurrency factor involves architecting the application to be stateless and capable of running as multiple independent instances, and the earlier factors help achieve this. These instances should not rely on shared in-memory state, ensuring they can be started, stopped, and scaled independently. In practice, this is often achieved using containerization technologies, where each container runs a separate instance of the application. Container orchestration tools like Kubernetes or Docker Swarm are then used to manage these containers, handling scaling based on predefined rules or metrics (like CPU usage or number of requests). In non-containerized environments, process managers can be used to spawn and manage multiple instances of the application. The key is to ensure that each process is identical and can handle a subset of the total workload independently.

Example of the Concurrency Factor In Practice

A RESTful API service is containerized using Docker. Each container runs an instance of the API service. Kubernetes is used to orchestrate these containers, automatically scaling the number of containers up or down based on the number of incoming requests. As traffic increases, Kubernetes starts more containers to handle the load, and as it decreases, it reduces the number of containers, thereby efficiently managing resources.
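
A hedged sketch of such a scaling rule as a Kubernetes HorizontalPodAutoscaler follows; the names are hypothetical, and this version scales on CPU utilization (scaling on request count, as described above, would require a custom or external metric):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-service
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-service        # the Deployment running the containerized API
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # add replicas when average CPU exceeds 70%
```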

Pitfalls and Complexities of the Concurrency Factor

Stateless processes can struggle with tasks that inherently require state, like user sessions or complex transactions. Use external services for state management, such as databases for persistent data or caching systems like Redis for session data.

When scaling horizontally, managing communication between processes can become complex, especially in a microservices architecture. Use message queues (like RabbitMQ or Kafka) for inter-process communication, which decouples the processes and allows them to communicate asynchronously.

9. Disposability

The Disposability factor emphasizes quick startup and graceful shutdown of applications, which greatly enhances their robustness and resilience. This approach ensures that applications can handle sudden surges in traffic by rapidly scaling up and then back down, while maintaining data integrity and minimizing issues during deployments and server reboots. Additionally, disposability facilitates efficient and reliable continuous deployment and elasticity in cloud environments, as instances can be easily and safely added or removed without disrupting the service. By embracing disposability, applications become more agile and robust, capable of adapting quickly to changes and recovering gracefully from failures, a critical requirement in dynamic, cloud-based environments.

Implementing the Disposability Factor

Implementing disposability involves designing the application to start up quickly and shut down gracefully. For quick startups, the application should minimize its initialization time, doing as little as possible before it starts accepting requests. Techniques like lazy loading of resources and avoiding lengthy pre-computations during startup can be helpful. Beyond lazy loading, one might use background jobs or asynchronous tasks that handle heavy computations separately and in parallel to the main application startup process. This way, the application becomes available to handle requests more quickly, while any complex or time-consuming computations are processed in the background, reducing the initial startup time. For graceful shutdowns, the application should be designed to handle termination signals, allowing it to complete current requests or transactions before stopping. This can be managed through proper signal handling in the application code. In containerized environments, container orchestration tools facilitate disposability by managing the lifecycle of containers, ensuring that they can be started and stopped rapidly in response to scaling needs or deployment updates.
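
As a rough illustration of graceful shutdown, a worker process might trap SIGTERM like this, finishing its current unit of work before exiting (the work itself is just a placeholder here):

```python
import signal
import time

shutting_down = False

def handle_sigterm(signum, frame):
    # Stop accepting new work, but let the item in progress finish.
    global shutting_down
    shutting_down = True

signal.signal(signal.SIGTERM, handle_sigterm)

def process_next_item():
    time.sleep(1)  # stand-in for a real unit of work

if __name__ == "__main__":
    while not shutting_down:
        process_next_item()
    print("drained in-flight work, exiting cleanly")
```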

Example of the Disposability Factor In Practice

In a microservices architecture, each microservice is containerized using Docker. When the system experiences high load, Kubernetes quickly spins up new containers (instances of microservices) to handle the load, demonstrating quick startup. When updating a service or during scale-down, Kubernetes sends a termination signal to the containers, which then gracefully shut down by completing current requests and saving any necessary state.

Pitfalls and Complexities of the Disposability Factor

Gracefully handling state, especially during unexpected shutdowns, can be complex and might lead to data loss or corruption. Implement robust error handling and state persistence strategies. Use external systems like databases or caches to manage state, ensuring that no critical information is lost during shutdowns.

Quick startups can be hindered if the application depends heavily on external resources or services that are slow to initialize. Optimize interactions with external resources. Consider techniques like caching or asynchronously loading non-critical resources to speed up startup times.

10. Dev/prod Parity

This factor focuses on minimizing the differences between development, staging, and production environments, significantly reducing the occurrence of bugs that are environment-specific. This parity helps in achieving more reliable deployments and a smoother development process. It also accelerates development cycles, as the feedback loop is tightened, with developers able to identify and address issues that would otherwise only appear in production. By striving for dev/prod parity, teams can mitigate “works in dev but breaks in prod” issues, ensuring smoother transitions in the release process and more reliable software delivery.

Implementing the Dev/prod Parity Factor

Achieving dev/prod parity is, in my experience, the most challenging and multi-pronged of the 12 Factors. When talking about dev/prod parity, these are the aspects to consider, along with some ways to implement parity for each:

  • Codebase: The same code should be used in both dev and prod. This includes using the same version control repository and branch strategies to ensure consistency. Use a version control system like Git and have a consistent branching strategy across all environments. Ensure all changes are committed and merged into the mainline branch, which is then used for both development and production deployments.
  • Backing Services: Services like databases, caching systems, and message queues should be the same or similar in both environments. Differences in service versions or types can lead to unexpected behavior in production. Similar to Configuration below, your dev and prod services will actually be different (do not connect your dev environment to the production database), but should be functionally the same in terms of behavior and kinds of data. Use the same types of services (like databases, messaging queues) in both environments. In development, you can use containers to replicate the production services, ensuring version and feature parity.
  • Configuration: Configuration settings (e.g., database URLs, API keys) should differ between dev and prod but be managed the same way. Use a configuration management tool or system to manage and apply these configurations consistently across environments. Store configuration in environment variables, configuration files that are not checked into version control, or config services.
  • Dependencies: All external libraries and dependencies should be consistent across environments. This includes using the same package manager and maintaining the same versions of dependencies. Use dependency management tools (like npm for Node.js, Maven for Java) and lock files to ensure that the same versions of dependencies are used in both development and production. Regularly update and synchronize these dependencies across environments.
  • Data and State Management: While the data in dev and prod will differ, the mechanisms for handling data and state (like databases and session management) should be similar, if not identical. While using different datasets, ensure the same database schemas and data handling mechanisms are employed. For state management, use similar session handling or caching strategies in both environments. Consider tools to generate production-like data for dev, and data scrubbing systems which will let you safely migrate some aspects and types of live data back to dev.
  • Runtime Environment: The operating system, runtime, and any system libraries should be as similar as possible to avoid “works on my machine” issues. Containerization is a great help with this. Aim for identical OS versions and runtime environments. Using containers can help achieve this by encapsulating the runtime environment along with the application.
  • Build and Release Process: The process of building and deploying the code, including any build scripts or CI/CD pipeline configurations, should be consistent across environments. Implement a CI/CD pipeline that is used for both development and production deployments. This ensures the same build and release process is followed for all environments.
  • Network Topology: The way components communicate with each other (like microservices communicating via REST APIs) should be replicated in both environments to avoid network-related issues in production. Replicate the production network setup in development, including aspects like load balancers, DNS configuration, and service communication protocols. Tools like service mesh can help in mirroring the network setup.
  • Resource Allocation: Production environments may have more resources (like CPU, memory), but the nature of the resources and how they are allocated should be mirrored as closely as possible in development. While resources may differ in scale, ensure they are of the same type (e.g., same database engine). In development, use tools to simulate production resource constraints for more accurate performance testing.
  • Security Practices: Security configurations and practices, including encryption, authentication, and authorization mechanisms, should be consistent to ensure that security testing in dev is valid for prod. A common issue around this is when a developer has admin access in dev, but then the application breaks in prod because a critical account has more limited access in the live system. Apply the same security configurations, such as SSL certificates, firewall rules, and access controls, in both environments. Regularly review and update these practices to maintain parity.
  • Monitoring and Logging: The tools and practices for monitoring and logging should be the same in both environments to ensure that issues can be diagnosed similarly in dev and prod. Use the same monitoring and logging tools in both environments. Ensure that logs are formatted and managed consistently, and that monitoring setups (like dashboards and alerts) are replicated (and tagged with the appropriate environment that they’re observing).

Example of the Dev/prod Parity Factor In Practice

A web application uses Docker containers to run its services. The development environment uses Docker Compose to replicate the production setup locally, including the same types of databases and external services, ensuring that developers work in an environment that closely resembles production. The application’s deployment pipeline, managed by a CI/CD tool like Jenkins or GitHub Actions, automates the deployment process, ensuring that the same steps and checks are followed for every environment.
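
A hedged sketch of what such a local Compose file might contain, assuming the production stack uses PostgreSQL and Redis (image versions and credentials here are illustrative):

```yaml
# docker-compose.yml -- mirror the types and versions of production backing services locally
services:
  db:
    image: postgres:15        # same engine and major version as production
    environment:
      POSTGRES_PASSWORD: devonly
    ports:
      - "5432:5432"
  cache:
    image: redis:7            # same caching engine as production
    ports:
      - "6379:6379"
```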

Pitfalls and Complexities of the Dev/prod Parity Factor

Often, development environments have less powerful hardware resources compared to production, which can lead to performance discrepancies. Regularly perform load testing and performance testing in an environment that mirrors the production hardware as closely as possible. Use cloud services that allow you to replicate production-like environments for testing.

The data used in development and testing environments may not accurately reflect the complexity and scale of production data. Use data masking and anonymization techniques to create realistic, production-like datasets for development and testing. Periodically refresh these datasets to reflect the changing nature of production data. Clearly split application configuration data from user data (e.g. lookup values to populate a dropdown are part of the application configuration, while information about which options a user selected is not) – application configuration data should be nearly identical between dev and prod.

11. Logs

The Logs factor emphasizes treating logs as event streams (as opposed to file-based logging), which offers significant benefits for application monitoring and debugging. By streaming logs from the application, they can be captured, aggregated, and analyzed by external systems, allowing for more sophisticated monitoring and analysis without burdening the application itself. This approach enhances the observability of the application, making it easier to track down issues, monitor performance, and understand user behavior in real-time. By following the Logs factor, applications become more maintainable and their operational issues more tractable, significantly enhancing the overall observability and reliability of the system.

Implementing the Logs Factor

To implement the Logs factor, configure your application to write logs to standard output (stdout) instead of writing to log files. This allows the runtime environment (like a container orchestration platform or a cloud service) to capture these logs and route them to a centralized logging system, such as an ELK (Elasticsearch, Logstash, Kibana) stack, Splunk, a cloud provider’s logging service, or a hosted service like Datadog. The centralized system can then aggregate logs from multiple sources, providing a unified view and enabling more effective searching, monitoring, and alerting. Additionally, consider implementing structured logging (writing logs in a structured format like JSON to facilitate easier parsing and analysis by the logging systems), tracing tools like AWS X-Ray, and log reference codes (which make it easier to link user issues back to logs, and log lines back to places in the code).
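
A minimal sketch of structured logging to stdout using only Python’s standard library (the field names are illustrative; a library like python-json-logger would serve the same purpose):

```python
import json
import logging
import sys

class JsonFormatter(logging.Formatter):
    def format(self, record):
        # One JSON object per line, so the log aggregator can parse individual fields.
        return json.dumps({
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })

handler = logging.StreamHandler(sys.stdout)   # stream to stdout, never to a local file
handler.setFormatter(JsonFormatter())
logging.basicConfig(level=logging.INFO, handlers=[handler])

logging.getLogger("checkout").info("order placed")
```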

Example of the Logs Factor In Practice

In a cloud-based microservices architecture, each service writes logs to stdout. These logs are captured by the cloud platform and forwarded to an ELK stack. Elasticsearch aggregates and indexes these logs, Logstash processes them, and Kibana provides a dashboard for querying and visualizing the log data, giving developers and operators insights into the system’s performance and issues.

Pitfalls and Complexities of the Logs Factor

High-volume log data can be overwhelming and difficult to manage, potentially leading to increased costs and complexity in log storage and analysis. Implement log rotation and archiving strategies in your logging system. Use log sampling or filtering to capture only the most relevant log data. Make use of different log levels for different environments.

Inconsistent log formats across different services can make it challenging to aggregate and analyze logs effectively. Adopt a standard logging format (like JSON) across all services. Use logging libraries that support structured logging and ensure consistent implementation across your application’s components.

12. Admin Processes

The Admin Processes factor promotes running administrative and management tasks as one-off processes, ensuring that these tasks are performed in an environment identical to the regular application but do not interfere with the application’s normal operation. This approach enhances the security and stability of the application, as one-off tasks like database migrations or batch scripts are isolated from the regular runtime environment. It also allows for more flexibility and control in managing these tasks, as they can be executed as needed without redeploying or interrupting the application. By adopting the Admin Processes factor, applications can maintain a clear separation between regular operations and administrative tasks, leading to improved stability and easier management.

Before the widespread adoption of principles like the Admin Processes factor, administrative tasks such as database migrations, data cleanup scripts, or batch jobs were often embedded within the application’s main codebase. This meant that these tasks would run as part of the application’s startup or shutdown process, or even during its normal operation. Without a clear separation, running these tasks could interfere with the normal operation of the application. For instance, a heavy data migration could slow down or disrupt the application’s performance during peak usage times. Additionally, every time an administrative task needed to be run, it often required redeploying or restarting the entire application, which could lead to downtime or other operational complexities. If you’re facing a situation and thinking “we shouldn’t have to do a deployment to get that done”, there’s a good chance that the system you’re working with is falling short on this factor.

Implementing the Admin Processes Factor

Implementing the Admin Processes factor involves setting up an environment where administrative tasks can be executed independently of the application’s regular runtime. These tasks should be kept in the same code repository as the application but should be runnable in isolation. Tools like Rake for Ruby or npm scripts for Node.js can be used to define and run these tasks. In a cloud or containerized environment, these one-off processes can be run in containers that are identical to the application’s containers but are used exclusively for the task at hand. It’s important to ensure that these tasks have access to the same environment variables and configuration settings as the regular application to maintain consistency.

Example of the Admin Processes Factor In Practice

In a web application, a database schema migration needs to be performed. Instead of incorporating this into the application’s startup process, a separate script is written. When it’s time to execute the migration, the script is run as a one-off process in the same type of environment as the application, using the same database configuration as the application does, but without affecting the running application instances.
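
A rough sketch of such a one-off script follows; it lives in the same repository and reads the same DATABASE_URL configuration as the application, but runs as its own short-lived process (the table, column, and file path are hypothetical):

```python
# scripts/add_last_login_column.py -- run once, e.g. `python scripts/add_last_login_column.py`
import os
from sqlalchemy import create_engine, text

# Same configuration source as the running application, separate process.
engine = create_engine(os.environ["DATABASE_URL"])

if __name__ == "__main__":
    with engine.begin() as conn:   # begin() commits on success, rolls back on error
        conn.execute(text("ALTER TABLE users ADD COLUMN last_login TIMESTAMP"))
    print("migration complete")
```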

Pitfalls and Complexities of the Admin Processes Factor

There’s a risk of inconsistencies between the environment in which the one-off tasks are run and the application’s regular environment. Use Infrastructure as Code (IaC) to define the environment, ensuring that the one-off tasks are run in an environment that is identical to the production environment.

Managing resources and access for one-off tasks, especially in a shared environment, can be challenging. Implement access controls and resource allocation policies that ensure one-off tasks do not consume excessive resources or interfere with the running application. This might involve setting CPU and memory limits on the containers or instances running these tasks.

Conclusion

The 12 Factor App methodology has been a foundational guide in the evolution of cloud-native application development, but it is important to recognize that the landscape of software development is continually evolving. Some aspects of the 12 Factors might feel dated as new technologies and paradigms emerge or as what was once revolutionary is now common practice. However, the core principles of the 12 Factors, focusing on modularity, scalability, and maintainability, still hold significant value. They provide a fundamental framework that can be adapted and expanded upon with newer practices and tools. As we move forward, it’s crucial to build on these foundations, integrating the lessons they offer with innovative approaches and technologies to stay at the forefront of software development efficiency and effectiveness. The 12 Factor App methodology, therefore, serves not just as a static checklist, but as a starting point for really understanding modern practice and for continuous improvement and adaptation.

An Overview of Virtual Private Clouds (VPCs) November 8, 2023

Posted by ficial in Blogroll, brain dump, techy.

In the ever-expanding realm of cloud computing, Virtual Private Clouds (VPCs) are one of the essential building blocks for creating secure, scalable, and manageable network environments. VPCs provide a logically isolated segment within a cloud provider’s infrastructure, enabling organizations to deploy and manage their cloud resources with enhanced control and privacy. So… what does that actually mean? Here’s an analogy.

VPC as a Party

Imagine you’re planning a grand party at your home, inviting your closest friends and family. To ensure a memorable and enjoyable event, you’ll need to organize your home and manage the flow of guests – much like how Virtual Private Clouds (VPCs) organize and manage network resources in cloud environments.

Think of your home as a cloud environment, and the party as an application or service running in the cloud. Just as you would divide your home into different areas for different purposes, such as the living room for socializing and the kitchen for cooking, VPCs allow you to divide your cloud environment into subnets for different types of resources.

To control who can enter your home and which rooms they can access, you would use door locks and keys. Similarly, VPCs use security groups to control network traffic, allowing only authorized resources to communicate with each other. Imagine security groups as door locks with specific keys, ensuring that only authorized traffic can enter your VPC subnet.

To connect your home to the outside world, you would use your front door. In the cloud, VPCs use gateways to connect to the internet or other VPCs. Think of gateways as your front door, allowing controlled access between your VPC and the outside world.

Managing a VPC is like managing your home during the party. You need to ensure that guests can move around freely within your home while preventing unauthorized access or disruptions. In the cloud, VPC management involves monitoring network traffic, identifying potential issues, and adjusting security rules to maintain a secure and efficient environment.

Just as a well-organized home enhances the enjoyment of your party, well-managed VPCs improve the performance and security of your cloud applications and services. By creating subnets, implementing security groups, and utilizing gateways, you can effectively manage your cloud network, ensuring that your resources are secure, performant, and accessible as needed.

Why VPCs Matter

VPCs play a crucial role in cloud environments by offering important benefits:

  • Isolation and Security: VPCs isolate cloud resources from other tenants in the cloud, preventing unauthorized access and ensuring data privacy. Think of it as having your own private neighborhood within a larger city, where only authorized residents can access your property.
  • Network Segmentation: VPCs allow for the creation of subnets, which further divide the network into smaller, more manageable segments. This segmentation is analogous to dividing your neighborhood into districts, each with its own set of rules and regulations.
  • Granular Control: VPCs enable granular control over network traffic using security groups, which act as virtual firewalls. These security groups define rules that specify which resources can communicate with each other and with the outside world. Imagine security groups as security personnel at the entrance to your neighborhood, carefully screening visitors based on their credentials.
  • Scalability: VPCs can be easily scaled up or down to accommodate changing network demands. This flexibility is akin to expanding or contracting your neighborhood as the number of residents grows or shrinks.
  • Cost Optimization: VPCs can help optimize cloud costs by enabling resource isolation and preventing unauthorized traffic, reducing the overall network footprint. Consider it as managing your neighborhood’s utilities efficiently to minimize unnecessary expenses.

VPCs address several key challenges in cloud networking:

  • Security Concerns: VPCs provide a secure environment for cloud resources by isolating them from other tenants and enforcing granular access controls.
  • Network Complexity: VPCs simplify network management by dividing the network into subnets and providing clear boundaries between different types of resources.
  • Traffic Management: VPCs enable efficient traffic management through security groups and gateways, preventing network congestion and ensuring optimal resource utilization.
  • Compliance Requirements: VPCs facilitate compliance with data privacy regulations by isolating sensitive data and controlling access to it.

VPCs and the Cloud Computing Quartet

VPCs seamlessly integrate with the fundamental components of cloud computing – compute, storage, network, and management – to provide a holistic and cohesive cloud environment. In essence, VPCs serve as a unifying layer that connects and manages the various components of cloud computing, ensuring a secure, performant, and scalable cloud environment. They provide the foundation for deploying, managing, and protecting cloud resources, enabling people to effectively utilize cloud computing services.

VPCs and Compute

VPCs provide a secure and isolated environment for deploying and managing compute resources, such as virtual machines (VMs) and containers. This isolation ensures that compute resources are protected from unauthorized access and that only authorized traffic can access them. VPCs also enable granular control over network traffic between compute resources, optimizing performance and preventing network congestion.

VPCs and Storage

VPCs facilitate the secure storage of data within a cloud environment. By isolating storage resources from other tenants, VPCs protect sensitive data from unauthorized access. Additionally, VPCs enable granular control over network access to storage resources, ensuring that only authorized compute resources can access specific storage volumes.

VPCs and Network

VPCs form the core of cloud networking, providing a logically isolated network environment within the cloud provider’s infrastructure. VPCs enable the creation of subnets, which further segment the network based on function or security requirements. They also provide granular control over network traffic using security groups, ensuring that only authorized traffic can flow within and outside the VPC.
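
As a hedged illustration of those pieces, here is roughly what creating a VPC, a subnet within it, and a security group rule looks like using the AWS SDK for Python (the CIDR blocks, names, and port are arbitrary examples):

```python
import boto3

ec2 = boto3.client("ec2")

# A logically isolated network, then a smaller segment carved out of it.
vpc = ec2.create_vpc(CidrBlock="10.0.0.0/16")
vpc_id = vpc["Vpc"]["VpcId"]
subnet = ec2.create_subnet(VpcId=vpc_id, CidrBlock="10.0.1.0/24")

# A security group acts like a door lock: only the traffic you explicitly allow gets in.
sg = ec2.create_security_group(GroupName="web", Description="allow https", VpcId=vpc_id)
ec2.authorize_security_group_ingress(
    GroupId=sg["GroupId"],
    IpPermissions=[{
        "IpProtocol": "tcp",
        "FromPort": 443,
        "ToPort": 443,
        "IpRanges": [{"CidrIp": "0.0.0.0/0"}],
    }],
)
```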

VPCs and Management

VPCs simplify cloud network management by providing a centralized platform for configuring and monitoring network resources. VPC management tools allow administrators to define subnets, configure security groups, and monitor network traffic patterns. This centralized management approach reduces complexity and improves the overall efficiency of network management.

VPC Practical Applications

VPCs are widely used in various cloud scenarios. Here are some common use cases:

  • Hosting Web Applications: VPCs provide a secure and scalable environment for deploying web applications, isolating them from other tenants and ensuring data privacy.
  • Running Virtual Machines: VPCs enable the deployment of virtual machines in a controlled environment, providing granular control over network access and security.
  • Managing Databases: VPCs facilitate the secure storage and management of sensitive databases, preventing unauthorized access and ensuring data integrity.
  • Developing and Testing Applications: VPCs provide a sandbox environment for application development and testing, preventing accidental or malicious changes from affecting production environments.

VPCs for Every Occasion, or Not

Virtual Private Clouds (VPCs) offer a wealth of benefits for cloud networking, but they are not always the most appropriate solution for every cloud environment. Understanding when and where to use VPCs is crucial for optimizing cloud resource utilization and ensuring network security.

Here are some scenarios for which VPCs would be very valuable:

  • Multi-Resource Cloud Environments: When deploying multiple cloud resources, such as web servers, application servers, and databases, VPCs provide isolation and security, preventing unauthorized access and ensuring data privacy.
  • Sensitive Data Management: When handling sensitive data, such as financial or medical records, VPCs offer enhanced security by isolating data storage and controlling network access, ensuring data integrity and compliance with regulations.
  • Secure Application Development and Testing: VPCs provide a sandbox environment for developing and testing applications, preventing accidental or malicious changes from affecting production environments.
  • Hybrid Cloud Architectures: When connecting cloud resources to on-premises infrastructure, VPCs facilitate secure and controlled communication between the two environments.

Here are some scenarios in which a VPC might not be worth the overhead:

  • Single-Resource Cloud Deployments: When deploying a single cloud resource, such as a simple web server, using a public IP address may be sufficient, as the need for isolation and granular control is minimal.
  • Limited Network Traffic: When network traffic is minimal and security concerns are low, using a public IP address may be adequate, as VPCs introduce additional complexity.
  • Temporary Cloud Deployments: For short-lived cloud deployments that do not require long-term network configurations, using a public IP address may be more convenient than setting up a VPC.

Also note that VPCs are not free: they may incur additional costs compared to using public IP addresses. For organizations with tight budget constraints, evaluating the cost-benefit ratio is crucial.

In summary, VPCs are valuable tools for cloud networking, particularly when dealing with multiple resources, sensitive data, or complex network environments. However, for simple deployments with minimal security requirements, using public IP addresses may be a more cost-effective option. Carefully evaluating the specific needs and constraints of each cloud environment will guide the decision of whether to implement VPCs or opt for alternative approaches.

Conclusion

VPCs have become an integral part of cloud infrastructure, providing organizations with the tools to create secure, scalable, and manageable network environments. By understanding their concepts, benefits, and applications, organizations can effectively leverage VPCs to achieve their strategic goals. As cloud technology continues to evolve, VPCs will remain at the forefront, ensuring that network infrastructure remains robust, secure, and adaptable.

A conflict resolution and alignment building process November 1, 2023

Posted by ficial in brain dump, engineering leadership.
Tags: , , ,
add a comment

I recently wrote an article about understanding and handling conflict resolution. Conflict resolution is closely tied to building alignment, and that article grew out of my thinking about how the two are related. As a part of writing that I detailed out the process I use for both, but that article was not a place to share the gory details. Happily, I also have a blog, which is a great place for that kind of thing :)

Whether it’s a new project, a challenging situation, or a conflict of some sort to resolve, this is the general process I use to build alignment and resolve conflicts within and across teams. This may be compressed and informal, or extended and formal, depending on the scope of what I’m facing.

  1. understand the situation – what is it, and how much alignment building is really needed
    1. get a general understanding of the situation
      1. what’s the history as well as the current reality
    2. figure out who’s involved
      1. any key relationships to be aware of
    3. look for opportunities to short-circuit the rest of this process (for small scale situations)
      1. is there a simple miscommunication that could be fixed?
        1. wrong communication
        2. confusing communication
        3. conflicting communication
      2. do people just need a third perspective?
      3. is this really an alignment problem, or just that someone needs to make a decision?
  2. get people together for a real-time discussion
  3. establish the rules of play
    1. get people on board with “disagree but commit”
    2. value everyone and all perspectives
      1. acknowledge emotions
    3. show respect
    4. acknowledge challenges / difficulties
      1. demonstrate understanding of the situation
      2. reference the history leading up to this
    5. discuss satisficing vs optimizing – the optimization trap; the bar can be set as high (or low) as we like, but the outcome just needs to surpass that bar, not to be perfect
  4. identify/establish fundamental points of commonality, especially around values and broad goals
    1. articulate them and call them out; highlight good fit
    2. engage all participants
  5. clarify the current and specific goal / outcome / problem; ensure common/shared understanding
    1. establish the boundaries – we’re not trying to solve everything; there are important things that will remain unhandled for now
    2. satisficing vs optimizing – what do success and alignment look like here, at least at a high level?
  6. draw the line between the fundamental and the specific
  7. provide context / framing for the specific
  8. get people talking
    1. canvass points of view
      1. celebrate the value of each
    2. acknowledge differences
    3. determine which differences actually matter
      1. goal impact – how relevant is this for the goal
      2. internal people impact – what do people care about
      3. external people / business impact – what does this mean for the org
    4. resolve differences
      1. clarify and refine what a successful resolution looks like
      2. set the tone and framing for the discussion
        1. rational
        2. but also don’t completely devalue the emotional side of things
      3. get people on board with “disagree but commit”
      4. ideally come to a consensus
      5. if a consensus isn’t possible, make a decision
        1. explain your reasoning; get feedback and adapt as necessary
        2. acknowledge all input and demonstrate its value to you
        3. keep things moving
  9. synthesize inputs into a shared outcome
  10. share conclusion / plan, and how it is flexible / adaptive
    1. identify pre-set check points
    2. identify breakpoints / signal events
    3. identify channels for communication
    4. clarify process for revisiting and changing things
      1. also who would be involved and channels for future input
  11. make sure conclusion is endorsed and supported
    1. “disagree but commit”
  12. show appreciation for engagement
  13. build energy for next steps

Product Development – A Perspective from an Engineering Leader October 25, 2023

Posted by ficial in brain dump, engineering leadership, techy.
Tags: , , , ,
2 comments

Understanding product development is essential, as product development is the lifeblood of businesses – driving innovation and market competitiveness. And, while the end goal of creating a product is crucial, the path to get there is filled with invaluable insights and opportunities. For software engineers, this process ensures harmony between their technical contributions and the product’s and business’ broader objectives. This alignment enhances the final product’s impact and streamlines the development process, fostering collaboration and driving business growth. From a larger business perspective, a robust product development process enables good decision making – what a business chooses to spend its time and resources on is critical to its success. This article provides a concise overview of the stages in product development – from ideation to post-delivery – based on my experiences and observations as a developer, manager, and director.

Product development is a nuanced and multifaceted process, often shaped by the methodology adopted and the scale of the project. While the specifics of this process can vary significantly from the structured approach of waterfall development to the flexibility of agile approaches, the fundamental steps are the same. Also, although I’m talking in terms of product development, this is applicable to new feature development on an existing product as well – a feature is in many ways (especially in the ways that matter here) just a smaller product in a smaller and more tightly defined domain.

Product Development in a Nutshell

Product development generally has these steps, in this order:

  1. Ideate
  2. Choose
  3. Clarify
  4. Decide
  5. Prepare
  6. Implement
  7. Deliver
  8. Post-Delivery

The decision point in step 4 is called out because it’s a critical inflection point in the amount of time and resources the business is committing to a project, but every step includes at least one active review and decision to move forward or not.

Caveats and Commentary

There is some unavoidable blending and cross-dependency for those steps – e.g. the decision might depend on how long it will take, which you don’t really know until the plan is done, which isn’t really clear until some implementation details are figured out, and so on. That’s where ballparks, t-shirt sizes, SWAGs, acceptable risk, risk management, and so on come in – things will never be clear and complete, so we rely on good-faith best guesses and estimates to keep things moving. The goal is to avoid big, costly, and possibly irreversible mistakes. Accept the small mistakes and adapt to recover from them quickly, and be willing at any point in the process to re-evaluate things and set aside work-to-date in favor of better uses of everyone’s time.

In the world of waterfall development, these steps tend to be highly robust, detailed, centralized / hierarchical, and to happen on scales of weeks and months. There are probably artifacts for every step, specialized teams for many steps, and multiple levels of review and decision making. The product is probably much bigger, and the risk tolerance much lower.

In the world of agile development these steps tend to be looser, more blended, decentralized / flat, ephemeral (resolved in a discussion, or just in one person’s mind), and to happen on the scales of hours or days. Possibly the only artifacts are the work ticket and the code and media coming out of the implementation. The product being built is probably smaller, and the risk tolerance is much higher. There may also be a larger product planned out with a more robust product spec that provides a distant goal, while walking the path to that goal is approached with agile iterations (and the goal may change based on feedback from those iterations).

Detailed Steps

The one-word steps in the nutshell list above can be a bit vague and high level. Here’s a more detailed (but not too detailed) breakdown of the product development steps:

  1. Ideate – “let’s make a Widget!” – Generate and collect ideas for what might be built. Idea sources include specific user requests, market analysis, user interviews, leadership brainstorm, etc.
  2. Choose – “of everything we could do, we’re focusing on this first” – Review the ideas, probably attach some metadata to them (e.g. impact, business alignment, scale, etc.), and pick one to move forward.
    1. Business Analysis (lightweight)
    2. Resource Analysis (lightweight)
  3. Clarify – articulate, explain, and justify – “this is what a Widget is”, “this is why we should make it”, “this is what it will take to make it” – From the chosen idea, create a PRD (Product Requirements Document). This focuses on the problem the product solves, its expected users, the business case, and other high-level considerations. A PRD covers the “why” and “what” of the product. It provides context, drives early decision making, and serves as a strategic guide for subsequent stages of product development.
    1. Review and Revision – “this isn’t clear”, “this won’t pay for itself”, “there’s an existing Gadget that’s very similar”, “here’s a related opportunity that’s even better”, etc.
      1. Resource Analysis and refined projections – “this is how long it will take / how much it will cost”, “this is the revenue we expect”, “this is the likely adoption curve”, etc.
      2. Business Analysis
  4. Decide – “we’re doing this” or “we’re not doing this (yet)” – Based on the PRD, along with the strategic priorities of the business, someone needs to make the decision on whether the business commits the resources to building that product or whether those resources would be used more effectively elsewhere. There are actually many go-forward decisions embedded within every other step of this process, but the line between the PRD and beginning the development effort is the largest inflection point in terms of the resources dedicated to a project. This is usually a point where someone higher up in an organization is called on to make a decision.
  5. Prepare – before diving into implementation, create clarity and confidence in what is actually being built and how; good groundwork here pays off in much faster and easier implementation and better outcomes
    1. Project Plan – “we said it would take this long, and here’s how we’ll be using that time”
    2. Specs and Requirements – there are often interdependencies here (e.g. can’t really nail down the tech reqs and UX reqs until the product specs are finalized), but as much as possible these are created and refined in parallel. There should be a lot of iterative refinement here as each area adapts to changes in the others, converging on specs and requirements that are stable enough to proceed.
      1. Product Spec – “this specifically is what we’re building and what it needs to do, how we know we’re done building it, and what success means”, “these are the functional requirements”, “these are the non-functional requirements”, “this is the scale of use”, “this is out of scope” – describes the intended features, functionality, and behavior of a product in detail. It serves as a blueprint for developers, designers, and other stakeholders involved in the product’s development.  The product spec typically includes information about the product’s purpose, features, user flows, interfaces, and any constraints or requirements. The product spec aims to ensure that the development team understands what needs to be built and how it should function, minimizing misunderstandings and reducing potential rework.
      2. Technical Requirements Doc (TRD) – “this is what we need from a technical perspective to meet the product requirements”, “these are the access patterns we need to support”, “this is the data we need to persist” – this outlines access patterns, API contracts, persisted data structures, etc. This doc may also indicate technical skill sets and domains that need to be covered when resourcing the implementation.
      3. UX Requirements – “this is what users need to make effective use of the product” – This goes into detail on the actions users need to take and the kinds of information that need to be presented to users. This highlights specific UX challenges that need to be resolved. This also indicates any patterns and practices that need to be followed to give users a consistent experience.
    3. Implementation Plans – these are worked on in parallel. These provide a guide for actual implementation, minimizing decision making that has to be done at that time.
      1. Technical Implementation Plan (TIP) – “this is how we’re going to implement / meet the technical requirements”, “this is the order, if not schedule, for implementation”, “this is how the implementation will be validated”, “these are the tools and technologies that will be used, and how they’ll be used”. This doc may further specify skill sets and technical domains that need to be covered in resourcing.
      2. UX Implementation Plan (Wireframes) – “this is how we’re going to implement / meet the UX requirements”, “this is the order, if not schedule, for implementation”, “this is how the implementation will be validated”, “here are wireframes”
    4. Progress Review – “how is the project going in terms of time and cost”, “do the product requirements need to be adjusted to meet the projected schedule”, “does the schedule need to be adjusted to meet the product requirements”
    5. Go-forward Decision – “keep going, do the actual implementation”, or “on review, this is not what we should be doing right now” – this is the second major inflection point in resource commitment. 
  6. Implement
    1. Resourcing – assign people to do the work
    2. Technical Implementation – write the code
    3. UX Implementation – make the designs
    4. Integration – make the various pieces work with each other
    5. Validation / Testing – make sure what’s done so far works and fulfills the requirements as expected
    6. Set Up Operational Monitoring – make sure the tooling is in place to observe the parts of the system that need to be observed to ensure on-going correct operation and allow proactive issue mitigation
    7. Set Up KPI Tracking – make sure the tooling is in place to collect and understand the data that shows whether the product is successful from a business perspective
    8. Update/Create end-user documentation and internal documentation – don’t go overboard, but don’t ignore this either
    9. Periodic Progress Review & Go-forward Decision – “at these milestones”, “at these times” – make sure things are on track, and always be willing to accept that a review may reveal the project is no longer the best thing for people to be working on, and that work so far may be set aside in favor of something else. The longer the time frame and the more resources involved, the more important this is.
      1. Adjust requirements and/or timeline as needed – usually this means “cut scope to meet a deadline”, but “we need to push out the deadline” and “we’ve learned more and so need to add more requirements” are also possible
  7. Deliver
    1. Final Validation – “technical testing”, “confirm monitoring and tracking”, “check off product requirements”, “review and approval by stakeholders”, “this is good enough to release”
    2. Go-forward decision – “give this to our users”, or “needs more work, kick it back to implementation (or earlier)”
    3. Deployment
  8. Post-Delivery – this isn’t really a ‘step’ per se since most of it can’t really be marked ‘completed’, but I include it as a state that needs to be accounted for in the course of operation over time.
    1. Operational Monitoring
    2. KPI Tracking
    3. Release clean up – “remove feature flags”, “make available to full user pool”, “remove dead code”, etc. (see the flag sketch after this list)
    4. Bug Tracking and Resolution
    5. Support
    6. Dependency monitoring
    7. Resolve dependency-driven disruptions
    8. Documentation updates
    9. End-of-life
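
To illustrate the “remove feature flags” item in the Post-Delivery step above, here is a hypothetical flag guard in Python. The flag name, lookup mechanism, and functions are invented for illustration; a real system would typically query a dedicated flag service. Code like this gets added during Implement to gate a gradual rollout, and deleting it once the feature is fully released is exactly what release clean up means:

import os

def widget_export_enabled() -> bool:
    # Placeholder flag lookup; real systems usually ask a flag service or config store.
    return os.environ.get("ENABLE_WIDGET_EXPORT", "false").lower() == "true"

def export_widget(widget: dict) -> str:
    if widget_export_enabled():
        # New behavior, gated behind the flag during rollout.
        return f"exported widget {widget.get('id')}"
    # Old behavior. Once the feature is fully released, the flag, this branch,
    # and the guard above are what "release clean up" removes.
    return "export not available"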

This sort of flow may also be nested or sequenced in a larger initiative. For example:

  • part of the review requires a proof-of-concept or prototype, so that needs to be completed as its own sub-project
  • defining the requirements needs feedback from users on a partial implementation, so something needs to get out the door asap even if the full project isn’t done
  • the implementation loop needs an alpha release to get feedback from users to resolve a complex UI challenge

The kinds of cases in the examples above are just part of the normal sequence and practice of agile product development (i.e. in very tight loops – get something done enough to get in front of users, get feedback, decide on direction / next steps, iterate & refine).

Conclusion

In the process of product development, every step in the product journey holds its weight. From the first spark of an idea to the final curtain call, informed decisions and flexibility are key. Whether you lean towards the structure of waterfall or the flexibility of agile, it’s all about making smart choices and staying open to feedback and change – staying adaptable and continuously learning will always be at the heart of success.

Software Engineering Aphorisms September 1, 2020

Posted by ficial in brain dump, software, techy.
Tags: , ,
add a comment
  • The robustness of the tool is proportional to the size of its user base.
  • You cannot create the solution until you can clearly articulate the problem.
  • ‘Why’ is usually more important than ‘How’ or ‘What’.
  • Self-documenting code is the foundation for good comments and good documentation, not a replacement for them.
  • Code to create fulcrums.
  • It’s easy to tell if code is working; it’s hard to tell if it’s doing the right thing.
  • Sometimes the work you have to do doesn’t help you become a better engineer, it just helps you get paid.
  • Special cases are the Devil.
  • You can’t be done with something until you know how to tell if you’re done.
  • Bits are cheap, go ahead and use a longer name.
  • Names have power.
  • It should be clear from a name how a thing is used/useful and why it’s important.
  • Where you put the braces doesn’t actually matter. Let it go.
  • The best engineers code in functionality, not a language.
  • Fail fast is only useful if you learn from the failures and start doing something differently.
  • Knowing is half the battle…. but 50% is still a failing grade.