Logging
Openverse is in the process of establishing and implementing its logging strategy. Until the PR that introduced more extensive logging in the Elasticsearch controller modules, we did not have an established approach to logging. That PR still does not establish a comprehensive approach, but it does introduce and follow some rules:
Always use a child logger based on the module name and method name. For example, at the top level of a module, there should be a parent_logger variable that is the result of calling logging.getLogger(__name__). Individual methods that log should get a child logger for themselves based on this parent logger and their method name:
def apply_filters(self, ...):
    # Named "logger" so that it does not shadow the logging module.
    logger = parent_logger.getChild("apply_filters")
    ...
When logging variable values, log them in the following format: name=value. Prefer that name equals the expression that produces the value. For example, if logging the length of a list named providers, name should be len(providers), with the full expression looking like this: len(providers)={len(providers)}. If logging a property of an object, prefer object.property_name={object.property_name}. Exclude serialization calls such as json.dumps from the name, as serialization is assumed: verified={json.dumps(verified)}.
Avoid using the pprint module for serializing log data, as it significantly increases the amount of space and time it takes to write a log. Instead, prefer a simpler json.dumps.
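The three rules above can be combined in a short sketch. The SearchController class, the providers and verified values, and the choice of log level are hypothetical and used only for illustration; they are not taken from the Openverse codebase.

```python
import json
import logging

# Module-level parent logger, named after the module.
parent_logger = logging.getLogger(__name__)


class SearchController:  # hypothetical class, for illustration only
    def apply_filters(self, providers, verified):
        # Child logger named after the method.
        logger = parent_logger.getChild("apply_filters")
        # name=value pairs; each name mirrors its expression, minus json.dumps.
        logger.info(
            f"len(providers)={len(providers)} "
            f"verified={json.dumps(verified)}"
        )
        # Keep only the providers marked as verified.
        return [p for p in providers if verified.get(p)]
```

With a handler attached to parent_logger, each line written by this method carries the child logger's name (ending in apply_filters) followed by the name=value pairs.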
These practices provide context in the logs and make them uniformly searchable and filterable based on the values assigned to the names. Using a child logger means we can easily see all the logs for a method. Using the name=value format means we always know how to filter any given logged variable, either by name or by name and value.
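As a rough illustration of that searchability, the sketch below filters plain-text log lines by name, or by name and value. The filter_lines helper and the sample lines are hypothetical, and the sketch assumes that logged values contain no whitespace; in practice this filtering would more likely happen in whatever log search tool is in use.

```python
def filter_lines(lines, name, value=None):
    """Return lines that log `name`, optionally with an exact `value`."""
    matches = []
    for line in lines:
        for token in line.split():
            logged_name, sep, logged_value = token.partition("=")
            # Skip tokens that are not name=value pairs or use another name.
            if not sep or logged_name != name:
                continue
            if value is None or logged_value == value:
                matches.append(line)
                break
    return matches


# Hypothetical log lines following the name=value convention.
sample = [
    "apply_filters len(providers)=3 page_size=20",
    "apply_filters len(providers)=0 page_size=20",
    "search result_count=57",
]
```

Calling filter_lines(sample, "len(providers)") selects every line that logs that name, while filter_lines(sample, "len(providers)", "0") narrows the result to lines where the list was empty.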
Openverse also makes use of request IDs. Because multiple workers are writing to the same log at once, any given request’s logs may be interspersed with the logs of another. Filtering on the request ID allows us to see the full logging story of a single request. This allows us to trace problematic requests through the codebase and understand what specific parts of the code are causing problems with these particular requests.
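One way to attach a request ID to every log record, sketched here with only the standard library, is a logging.Filter that reads the ID from a context variable. This is an illustration under stated assumptions and not necessarily how Openverse does it: the request_id_var variable, the RequestIDFilter class, the build_handler helper, and the log format are all hypothetical.

```python
import contextvars
import logging

# Hypothetical context variable holding the current request's ID.
request_id_var = contextvars.ContextVar("request_id", default="none")


class RequestIDFilter(logging.Filter):
    """Copy the current request ID onto each record so formatters can use it."""

    def filter(self, record):
        record.request_id = request_id_var.get()
        return True  # never drop the record, only annotate it


def build_handler(stream=None):
    """Create a stream handler that prefixes each line with the request ID."""
    handler = logging.StreamHandler(stream)
    handler.addFilter(RequestIDFilter())
    handler.setFormatter(
        logging.Formatter("request_id=%(request_id)s %(name)s %(message)s")
    )
    return handler
```

In this sketch, request-handling code would set request_id_var at the start of each request. Filtering the log on request_id=<id> then yields only that request's lines, even when several workers write to the same log.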
Future improvements
Potential future improvements to logging in Openverse could include:
An even more uniform data logging format, such as formatting all logs as JSON.
Establishing clearer practices around what log levels to use and when.
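As a sketch of what the first item could look like, the formatter below writes each record as one JSON object per line. The JSONFormatter class and the chosen field names are hypothetical; nothing here is an adopted Openverse design.

```python
import json
import logging


class JSONFormatter(logging.Formatter):
    """Serialize each log record as a single-line JSON object."""

    def format(self, record):
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        return json.dumps(payload)
```

Attaching this formatter to a handler with handler.setFormatter(JSONFormatter()) would make every line machine-parseable, at the cost of being less readable in a raw terminal.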