Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Alfresco db content size and contentstore size: How to match?

I have an SQL query to get the content size from a content URL, but when I execute the SQL and compare the result to the directory size of contentstore it doesn't match. For example first I get the content size from database (PostgreSQL):

SELECT count(*),
       SUM(round(u.content_size/1024/1024,2)) AS "Size (MB)",
       to_char(n.audit_created:: date,'yyyy') AS yyyy_
  FROM alf_node            AS n,
       alf_node_properties AS p,
       alf_node_properties AS p1,
       alf_namespace       AS ns,
       alf_qname           AS q,
       alf_content_data    AS d,
       alf_content_url     AS u
WHERE n.id=p.node_id
  AND ns.id=q.ns_id
  AND p.qname_id=q.id
  AND p.long_value=d.id
  AND d.content_url_id=u.id
  AND p1.node_id=n.id
  AND p1.qname_id IN (SELECT id FROM alf_qname WHERE local_name='name')
GROUP BY yyyy_
ORDER BY yyyy_ DESC;

It gives me a total size of 600 GB. But when I see the size property for folder content store in alf_data it shows 2.4 TB. How is that possible?

like image 533
GaneshK Avatar asked Sep 14 '25 18:09

GaneshK


1 Answers

It is not so easy to know the space of the contentstore based only on what you currently have in your repository.

When a document is deleted or updated with a new version, Alfresco does not immediately delete the binary from the disk.

The process to actually free the disk space is a bit more "complex". How to configure the first step of this process is described in the following article: Cleaning up orphaned content (purge)

The "purge" by default is configured to be executed every day at 4AM and will purge any orphaned file that has been removed from the repository at least 14 days earlier.

As described in the article, by default, Alfresco does not delete the purged files but it only moves them in a different folder (contentstore.deleted).

It is safe to delete the files in the contentstore.deleted after a backup.

As usual, in alfresco-global-properties, you can modify the default configuration (that you can find in alfresco/repository.properties)

You can end up with a lot of orphaned files in many ways. Just to give some examples:

  1. the purge is not working for some reason
  2. someone recently has imported and then deleted many and/or big files
  3. someone has used the alfresco export, that creates the zip file in the repository
  4. the system is used to store short-living documents that are often deleted or updated (like big videos), etc...
like image 56
Marco Altieri Avatar answered Sep 16 '25 10:09

Marco Altieri