What is metadata and how to use it effectively 


This article is intended for general use. It is not specific to a certain product. 

What is metadata? Simply put, it’s information, or properties, about your document. In SharePoint, if you right-click a document and select More > Properties, you’ll see basic information about your document. This is metadata. It may include author name(s), date created, date modified, file size, title, and tags.

This metadata is basic, and in order to group documents we often rely on folder structure to provide other properties or parameters. 

Why is metadata important? 

Metadata is used in search to help find content. In document management systems, metadata also allows you to better search, filter, group and sort your documents instead of, or in addition to, folders. It can also be used to trigger certain actions like workflows or rules. 

Unlike a file share, document management systems provide a variety of methods to collect additional metadata that you specify, including forms, custom content types (additional metadata fields collected with a document), tags and keywords. The primary challenge units face is to determine what metadata to collect and how to collect it. 

Most systems provide these fields out of the box: 

  • Filename (auto populated) 
  • Title 
  • Author/Creator (auto populated) 
  • Description 
  • Date (auto populated) 
  • Comments 

Most of us rely on folders to group or categorize documents, but the problem with using only folders is that they are static. Folders represent one person’s way of sorting or viewing data, and you can’t easily filter or regroup content. For example, if my folders are organized by department with subfolders for years, I can’t easily generate a list of all content across departments for a specific year. However, by using metadata fields instead of, or in conjunction with, folders, I can regroup my content on the fly. 
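To make the regrouping idea concrete, here is a minimal Python sketch. It is not tied to any product: the document records and the field names "department" and "year" are made up for illustration, and a real document management system would expose this through its own views or search rather than through code like this.

```python
from collections import defaultdict

# Hypothetical documents; "department" and "year" are illustrative metadata fields.
documents = [
    {"filename": "budget.xlsx", "department": "Finance", "year": 2022},
    {"filename": "hiring-plan.docx", "department": "HR", "year": 2022},
    {"filename": "audit.pdf", "department": "Finance", "year": 2023},
]


def group_by(docs, field):
    """Group document filenames by the value of one metadata field."""
    groups = defaultdict(list)
    for doc in docs:
        groups[doc[field]].append(doc["filename"])
    return dict(groups)


# The same documents, regrouped two different ways without moving any files.
by_department = group_by(documents, "department")
by_year = group_by(documents, "year")
```

With folders organized by department, the "by year" listing would require opening every department folder; with metadata it is just a different grouping of the same records.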

How do I know what metadata I need to collect? 

There is no simple answer. A thorough understanding of your content and your business processes is required. Every group will have different needs. In fact, metadata needs may even differ from one type of document to another. 

Think about how you use content, and how you can make your life easier by assigning metadata. Metadata is most useful with respect to reporting, so are there regular reports or status updates you need to provide? 

Perhaps you want to be able to see all documents by student ID, or by department. Perhaps you want to be able to easily extract all documents related to a specific project or application. With a document management system you can easily do this by creating new metadata fields. 

Look at your existing folder structure. Can you reduce the number of levels by replacing a folder with a metadata field? 

In the example below, you’ll see some suggestions for translating folder structure to metadata. 

 

Figure 1 Traditional Windows Explorer folders 


With folders, I have only one way to look at the content. If I need to look at YE files for the last five fiscal years, I have to comb through five separate folders. If this were structured with metadata, I could easily display only YE files and group them by fiscal year. Plus, I can change this view on demand. Not only does this reduce the number of clicks and folders the user has to scan, but it also gives the user the option to group the content in different ways. 
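As a rough illustration of replacing folder levels with fields, the Python sketch below turns a folder path into metadata. The "Department/FiscalYear/Category/file" layout, the sample paths, and the field names are assumptions made for this example; they are not taken from Figure 1 or from any particular system.

```python
from pathlib import PurePosixPath


def metadata_from_path(path):
    """Turn a 'Department/FiscalYear/Category/file' path into metadata fields."""
    parts = PurePosixPath(path).parts
    if len(parts) != 4:
        raise ValueError(f"unexpected folder depth: {path}")
    department, fiscal_year, category, filename = parts
    return {
        "filename": filename,
        "department": department,
        "fiscal_year": fiscal_year,
        "category": category,
    }


# Hypothetical paths from an existing folder structure.
paths = [
    "Finance/FY2021/YE/summary.xlsx",
    "Finance/FY2022/YE/summary.xlsx",
    "Finance/FY2022/Invoices/inv-001.pdf",
]

records = [metadata_from_path(p) for p in paths]

# One filter replaces opening each fiscal-year folder in turn.
ye_files = [r for r in records if r["category"] == "YE"]
```

Once each folder level is a field, the YE files for every fiscal year come back from a single filter, and the same records can still be grouped by department or fiscal year.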

The examples below show you two ways you can sort and group the same documents, but there are many others. These views are entirely customizable by the user based on their needs. 

 

Figure 2 Grouping documents by fiscal year field 

 

Figure 3 Grouping documents by category 

How you do this will be different in each system. You will usually have some options on how you define those fields; for example, you can create a pre-populated picklist, specify numeric values, or allow users to assign multiple values. You can also set fields as mandatory or optional. You may be able to set default values or use rules to auto-populate some fields. 
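The field options described above (picklists, mandatory versus optional fields, default values) can be sketched in Python as follows. The Field class, the apply_fields function, and the field names are hypothetical and exist only for this example; your system will have its own way of defining fields.

```python
from dataclasses import dataclass


@dataclass
class Field:
    name: str
    required: bool = False
    choices: tuple = ()     # picklist; an empty tuple means any value is accepted
    default: object = None  # value used when the user supplies none


def apply_fields(fields, values):
    """Return validated metadata with default values filled in."""
    result = {}
    for field in fields:
        value = values.get(field.name, field.default)
        if value is None:
            if field.required:
                raise ValueError(f"missing required field: {field.name}")
            continue  # optional field left blank
        if field.choices and value not in field.choices:
            raise ValueError(f"invalid value for {field.name}: {value!r}")
        result[field.name] = value
    return result


# Illustrative field definitions.
FIELDS = [
    Field("status", required=True, choices=("draft", "reviewed", "delivered"), default="draft"),
    Field("fiscal_year", required=True),
    Field("comments"),
]

# The user only supplies the fiscal year; status falls back to its default.
tagged = apply_fields(FIELDS, {"fiscal_year": 2023})
```

Giving the mandatory status field a default means users never have to fill it in just to save a document, which is one way to reduce the tagging burden.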

Some other common examples of metadata: 

  • Add a Status field: Track where your files are in the process of getting them out to your customer. It can be useful to know whether the file is: draft, peer-reviewed, manager-reviewed, CEO-approved, or delivered to client, for example. Each time a step is completed, that person updates the status for everyone to know. 
  • Add a Document type field: Categorize your files based on invoices, agendas, meeting minutes, presentations, charts, policies, letters, announcements, or any other type that makes sense. 
  • Add a year or date field: Especially useful if you are storing records related to meetings or other annual events or activities. 

It’s important to collect only what you need. Many users will complain about the additional burden of populating fields, so don’t make things worse by capturing metadata that will never be used. You can relieve some of the stress by populating default values or using built-in capabilities (if your system has them) to automatically populate metadata where possible. 

Keep it updated 

For metadata to be of any use to you and your colleagues, it has to be kept up to date at all times. That means whenever a file is created or uploaded, you need to tag it with the correct categories; otherwise it won’t show up under those categories. If you depend on those categories to find content, an untagged file is essentially invisible. 

If you introduce metadata in a document library for your team, you need to talk that through with them so they understand the value. You’re adding a bit of a tax on them: the time it takes to categorize the files. Make sure your colleagues agree that there’s value (or understand that management does), so performing the task of updating metadata will be seen as a positive thing. This is especially true if you are relying on a metadata field to execute an action. 

Content Types and Content Models 

Metadata fields are applied first at the document library level, which means everything within that library will have the same fields. But what if you have different types of content that require different metadata? Say you have meeting minutes (for which you want to capture facilitator and date) and procedure manuals (for which you want to capture area/application and status) in the same document library. 

Document management systems give you the ability to create custom content types. A content type is a category of documents that have common characteristics and can be classified under one umbrella. So in the example above you might create two distinct content types (one for Meeting Minutes and one for Procedure Manuals) and define the associated metadata/properties for each. Multiple content types can reside in the same library. When a user uploads a document, they will be prompted to pick which content type it belongs to, and then they will be shown the appropriate fields to populate. 
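A minimal Python sketch of the two content types from the example above is shown here. The dictionary layout and function names are assumptions made for illustration, not how any specific system stores content types.

```python
# Content types from the example above, each with its own metadata fields.
CONTENT_TYPES = {
    "Meeting Minutes": ["facilitator", "date"],
    "Procedure Manual": ["area", "status"],
}


def fields_for(content_type):
    """Return the fields a user would be asked to fill in for a content type."""
    if content_type not in CONTENT_TYPES:
        raise KeyError(f"unknown content type: {content_type}")
    return CONTENT_TYPES[content_type]


def missing_fields(content_type, metadata):
    """List the fields for this content type that have not been filled in yet."""
    return [name for name in fields_for(content_type) if not metadata.get(name)]


# Both kinds of document live in the same library but prompt for different fields.
minutes_prompt = fields_for("Meeting Minutes")
manual_gaps = missing_fields("Procedure Manual", {"area": "Payroll"})
```

Here a procedure manual tagged only with an area would still be flagged as missing its status, while meeting minutes are never asked for either of those fields.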

There are several reasons to consider creating multiple (custom) content types. These include: 

  • Executing different actions or rules. If you plan to implement workflows, you will likely want different content types to trigger different workflows (e.g., an invoice gets routed to a different person and filed to a different location than a transcript). 
  • Setting different retention policies based on content type. 

The procedure to create custom content types is unique to each product, so you’ll need to consult the documentation for your particular document management system. 

Plan Ahead 

It is important to think about what properties you need in advance because once you start to populate your library, it becomes difficult to make a change. You can add or modify fields at any time, but the changes won’t automatically be applied to documents already uploaded; you will have to manually update them. 
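To see why late changes are costly, consider this small Python sketch. It assumes a hypothetical "project" field that was added after some documents had already been uploaded; the records and the function name are illustrative only.

```python
# Hypothetical library: the "project" field was added after the first upload.
documents = [
    {"filename": "minutes-jan.docx", "status": "draft"},
    {"filename": "minutes-feb.docx", "status": "draft", "project": "Website"},
]


def missing_field(docs, field):
    """List filenames whose metadata has no value for the given field."""
    return [doc["filename"] for doc in docs if doc.get(field) in (None, "")]


# Documents uploaded before the change still need to be updated by hand.
to_update = missing_field(documents, "project")
```

In a library with thousands of existing documents, that list is the manual work created by adding the field late.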

Conclusion 

Metadata can provide amazing results, but only if you plan ahead, collect only what you need, and keep it up to date.

Otherwise, it can be an extremely expensive project that provides absolutely no business value.