Final Approach of Refactoring ORMs & Repositories for a Better Attributes Management

I’ve already written three posts (including this one) about refactoring ORM and repository modules for better attribute management.

This one should be the last part.

Let me give you a brief overview of the previous post in case you haven’t had a chance to read it.

The primary goal of this refactoring is to be able to manage column changes while touching as few modules as possible.

The previous post presents an approach where all the column management operations happen in a single file, TableORM.py.

For the sake of clarity, here are the modules using the previous approach.

MODULES USING THE SECOND APPROACH

File: TableConfig.py

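def get_all_columns():
    # column names are plain strings so they can be compared with preserved_columns
    return [
        'field_a', 'field_b', 'field_c', 'field_d',
        # ... remaining columns omitted ...
        'field_final'
    ]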

File: StorageUtil.py

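from sqlalchemy import Column


class MyTableUtil(object):
    @staticmethod
    def get_orm_columns(preserved_columns, repo_columns):
        orm_columns = []

        for column in repo_columns:
            # preserved columns are declared by hand in the ORM, so skip them here
            if column not in preserved_columns:
                # ORM_SCHEMA_MAPPING and DATA_SCHEMA are assumed to be defined elsewhere in the project
                column_type = ORM_SCHEMA_MAPPING[DATA_SCHEMA[column]]
                orm_columns.append(Column(column, column_type))

        return orm_columns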

File: TableORM.py

from sqlalchemy import Column, Integer, String, Table
from sqlalchemy.ext.declarative import declarative_base

from TableConfig import get_all_columns
from StorageUtil import MyTableUtil

Base = declarative_base()


class TableORM(Base):

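  __tablename__ = 'my_table'

  @staticmethod
  def _get_columns(preserved_columns, repo_columns):
      # stores all the specific columns
      columns = [
          Column('field_a', String, primary_key=True)
      ]

      return columns + MyTableUtil.get_orm_columns(preserved_columns, repo_columns)

  @staticmethod
  def _construct_table(table_name, columns):
      return Table(table_name, Base.metadata, *columns)

  preserved_columns = ['field_a']
  repo_columns = get_all_columns()

  # names from the class body are not visible inside the static methods,
  # so they are passed in explicitly; staticmethod objects are not callable
  # in the class body before Python 3.10, hence the __func__ calls
  columns = _get_columns.__func__(preserved_columns, repo_columns)
  __table__ = _construct_table.__func__(__tablename__, columns)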

File: TableRepository.py

import abc

from TableORM import TableORM


class TableRepository(metaclass=abc.ABCMeta):
    def __init__(self):
        # the repository reads its column list from the ORM class
        self._columns = TableORM.repo_columns

As you can see, we only need to declare all the columns (preserved_columns and repo_columns) in TableORM.py. Since TableRepository.py also needs to have the repo_columns, we can retrieve them by using TableORM.repo_columns.
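To make the two lookups inside MyTableUtil.get_orm_columns more concrete, here is a minimal sketch. The contents of DATA_SCHEMA and ORM_SCHEMA_MAPPING are not shown in this post, so the values below are assumptions, and FakeColumn is a hypothetical stand-in for sqlalchemy's Column so that the snippet runs without any database library.

```python
from collections import namedtuple

# Hypothetical stand-in for sqlalchemy.Column, used only for illustration.
FakeColumn = namedtuple("FakeColumn", ["name", "type_"])

# Assumed contents: column name -> logical data type.
DATA_SCHEMA = {"field_a": "string", "field_b": "integer", "field_c": "string"}

# Assumed contents: logical data type -> ORM type (plain strings here).
ORM_SCHEMA_MAPPING = {"string": "String", "integer": "Integer"}


def get_orm_columns(preserved_columns, repo_columns):
    """Build a column object for every repo column that is not preserved."""
    orm_columns = []
    for column in repo_columns:
        if column not in preserved_columns:
            column_type = ORM_SCHEMA_MAPPING[DATA_SCHEMA[column]]
            orm_columns.append(FakeColumn(column, column_type))
    return orm_columns


generated = get_orm_columns(["field_a"], ["field_a", "field_b", "field_c"])
print([column.name for column in generated])  # ['field_b', 'field_c']
```

The preserved column field_a is skipped because it is declared by hand, while every other column gets its type from the two dictionaries.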

Now, let’s talk about the problem.

Our TableRepository class is an abstract class. There are two child classes inheriting from it, let’s call them TableRepositoryChildA and TableRepositoryChildB.

In addition, even though both child classes use the same repository columns, only TableRepositoryChildA uses the ORM. Therefore, giving both child classes a _columns attribute that is read from TableORM is not a good idea in terms of module architecture. I think TableRepositoryChildB should be independent of the ORM since it never uses it.

You might wonder why I didn’t create an __init__ method for TableRepositoryChildA that stores the _columns attribute. That would work for TableRepositoryChildA, but we would then have to declare all the repository columns again for TableRepositoryChildB. Doing so takes us back to the first approach, where the columns are declared in two modules: the ORM and the repository.

So, what does the current approach look like?

THE CURRENT APPROACH

Since only one of the two classes uses the ORM, the column declarations are done in the repository module.

In short, here are the updated modules.

File: TableRepository.py

import abc

from sqlalchemy import Column, Integer, Table
from sqlalchemy.ext.declarative import declarative_base

from StorageUtil import ChildARepository, MyTableUtil
from TableConfig import get_all_columns

Base = declarative_base()


class TableRepository(metaclass=abc.ABCMeta):
    @staticmethod
    def get_columns():
        # single place where the repository columns are exposed
        return get_all_columns()


class TableRepositoryChildA(ChildARepository, TableRepository):
    @staticmethod
    def get_orm_columns() -> [Column]:
        preserved_columns = ['field_a']
        # note: a set difference does not preserve the original column order
        nonpreserved_columns = list(set(TableRepositoryChildA.get_columns()) - set(preserved_columns))

        # preserved columns are declared by hand because they need special options
        preserved_orm_columns = [
            Column('field_a', Integer, primary_key=True)
        ]

        return preserved_orm_columns + MyTableUtil.get_orm_columns(nonpreserved_columns)


class TableORM(Base):
    __table__ = Table('my_table', Base.metadata, *TableRepositoryChildA.get_orm_columns())

As you can see from the above code snippet, TableORM has been moved into the repository module. This was not because we wanted to keep the column declarations in a single file, but because of a circular dependency problem when importing modules: TableRepository.py imports TableORM.py while TableORM.py also imports TableRepository.py.
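The circular import described above can be reproduced in isolation. The sketch below is only an illustration: it writes two throwaway modules into a temporary directory (demo_orm and demo_repository are hypothetical names, not the real files from this post) and tries to import one of them.

```python
import importlib
import pathlib
import sys
import tempfile

# Hypothetical module: the ORM needs the repository to get its columns.
ORM_SOURCE = """\
from demo_repository import DemoRepository

class DemoORM:
    columns = DemoRepository.get_columns()
"""

# Hypothetical module: the repository needs the ORM class.
REPOSITORY_SOURCE = """\
from demo_orm import DemoORM

class DemoRepository:
    @staticmethod
    def get_columns():
        return ["field_a", "field_b"]
"""


def demo_circular_import():
    """Return the ImportError message raised by the circular import, if any."""
    with tempfile.TemporaryDirectory() as tmp:
        root = pathlib.Path(tmp)
        (root / "demo_orm.py").write_text(ORM_SOURCE)
        (root / "demo_repository.py").write_text(REPOSITORY_SOURCE)
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            importlib.import_module("demo_repository")
        except ImportError as error:
            return str(error)
        finally:
            # clean up so the demo leaves no trace in the interpreter
            sys.path.remove(tmp)
            sys.modules.pop("demo_orm", None)
            sys.modules.pop("demo_repository", None)
    return None


message = demo_circular_import()
print(message)
```

While demo_repository is still being executed, demo_orm asks it for DemoRepository, which has not been defined yet, so the import fails. Putting both classes in one module, as done here with TableORM, removes the cycle.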

Another thing you might notice is that the TableRepositoryChildA class now inherits from two classes (previously it inherited only from TableRepository).

The ChildARepository class is an abstract class whose sole focus is creating the columns that reside in the ORM.

File: StorageUtil.py

import abc

from sqlalchemy import Column


class ChildARepository(metaclass=abc.ABCMeta):
    @staticmethod
    @abc.abstractmethod
    def get_orm_columns():
        pass


class MyTableUtil(object):
    @staticmethod
    def get_orm_columns(repo_columns):
        # ORM_SCHEMA_MAPPING and DATA_SCHEMA are assumed to be defined elsewhere in the project
        return [
            Column(repo_column, ORM_SCHEMA_MAPPING[DATA_SCHEMA[repo_column]])
            for repo_column in repo_columns
        ]
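As a side note on the abstract static method, here is a small sketch of what an abstract base class enforces in Python 3, where the metaclass is passed as a keyword argument in the class statement. ColumnProvider, IncompleteRepository and CompleteRepository are hypothetical names used only for this illustration.

```python
import abc


class ColumnProvider(metaclass=abc.ABCMeta):  # hypothetical name
    @staticmethod
    @abc.abstractmethod
    def get_orm_columns():
        """Return the ORM columns; concrete subclasses must override this."""


class IncompleteRepository(ColumnProvider):
    pass  # forgets to implement get_orm_columns


class CompleteRepository(ColumnProvider):
    @staticmethod
    def get_orm_columns():
        return ["field_a", "field_b"]


def can_instantiate(cls):
    """Return True if cls() succeeds, False if it raises TypeError."""
    try:
        cls()
    except TypeError:
        return False
    return True


print(can_instantiate(IncompleteRepository))  # False
print(can_instantiate(CompleteRepository))    # True
```

Keep in mind that this check only runs when an instance is created. Calling a static method directly on the class, without instantiating it, is not blocked by the abstract base class machinery.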

The MyTableUtil class in StorageUtil.py has been updated as well: it no longer filters out the preserved columns. That filtering is now done in the get_orm_columns method of the TableRepositoryChildA class.

In conclusion, this approach lets us do all the column declarations in a single file, namely TableRepository.py.
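One detail worth considering about that filtering step: a set difference does not keep the columns in their declared order. If the column order matters (for example, for the order of columns in the created table), an order-preserving filter is a possible alternative. The helper below, split_columns, is a hypothetical name and not part of the modules above.

```python
def split_columns(all_columns, preserved_columns):
    """Return (preserved, nonpreserved), keeping the original column order."""
    preserved_lookup = set(preserved_columns)
    preserved = [c for c in all_columns if c in preserved_lookup]
    nonpreserved = [c for c in all_columns if c not in preserved_lookup]
    return preserved, nonpreserved


all_columns = ["field_a", "field_b", "field_c", "field_d"]
preserved, nonpreserved = split_columns(all_columns, ["field_a"])
print(preserved)     # ['field_a']
print(nonpreserved)  # ['field_b', 'field_c', 'field_d']
```

The set is still used, but only for fast membership checks; the output order always follows all_columns.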

  • The first column declaration, which covers all the repository columns, is done in the get_columns method of the TableRepository class
  • The second column declaration, which covers all the preserved columns, is done in the get_orm_columns method of the TableRepositoryChildA class

Moreover, the dependency that previously existed between TableORM.py and the classes that don’t use the ORM has finally been removed.
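To illustrate that last point, here is a rough sketch of a repository child that gets its columns from the base class without importing anything from the ORM. The body of TableRepositoryChildB is not shown in this post, so the CSV-writing behaviour, the save method and the column list below are all assumptions made for the example.

```python
import abc
import csv
import io


def get_all_columns():
    # Hypothetical column list standing in for TableConfig.get_all_columns().
    return ["field_a", "field_b", "field_c"]


class TableRepository(metaclass=abc.ABCMeta):
    @staticmethod
    def get_columns():
        return get_all_columns()

    @abc.abstractmethod
    def save(self, rows):
        """Persist rows; each child decides how (hypothetical method)."""


class TableRepositoryChildB(TableRepository):
    """Writes rows as CSV text and never touches the ORM."""

    def __init__(self):
        self.buffer = io.StringIO()

    def save(self, rows):
        writer = csv.DictWriter(self.buffer, fieldnames=self.get_columns())
        writer.writeheader()
        writer.writerows(rows)


repository = TableRepositoryChildB()
repository.save([{"field_a": "1", "field_b": "x", "field_c": "y"}])
lines = repository.buffer.getvalue().splitlines()
print(lines[0])  # field_a,field_b,field_c
```

The child only relies on get_columns, so a change to the ORM side cannot break it.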

I guess this is the end of this article.

Hope it helps those who’ve been trying to address a similar problem.

Thank you for reading.