Configuring app logging

When troubleshooting, the most important step is to increase the app's logging level so that it writes out more details about what it is doing.

To increase the logging level:

  1. By default, DEBUG is the lowest logging level Confluence provides. If you need TRACE-level logging, complete the following steps; otherwise, skip this and continue with Step 2.
    • Stop Confluence.
    • Edit <Confluence installation directory>/confluence/WEB-INF/classes/log4j.properties
    • Replace log4j.appender.confluencelog.Threshold=DEBUG with log4j.appender.confluencelog.Threshold=ALL
    • Start Confluence.
  2. Log in to Confluence as an administrator.
  3. Go to Confluence Administration → General Configuration → Logging and Profiling.
  4. At Add New Entry, enter "com.midori" in the Class/Package Name field and choose "ALL" as the New Level.
  5. Click Add entry.
  6. Check if this new entry was correctly added to the Existing Levels list.

Now execute the app function that you think fails, and check your Confluence log to see the details.

If you can't interpret the log lines, report what you have found to our support team. We are here to help you.

Typical problems and their solutions

Our users are not receiving the notification emails.

Make sure that your users have valid email addresses configured, and the SMTP servers in Confluence are configured and tested. To verify:

  1. Watch any page using a user account that was not receiving the Better Content Archiving notification emails.
  2. Edit that page using another user account.
  3. If the problematic user receives the "page edited" built-in notification email, it means that the user account and the SMTP server are correctly configured.

If you still have the problem, have a look at your Confluence log to see if there are errors logged around email sending. If the emails are sent out according to the log, then check your spam folders and your spam detector.

There are "Page (or Attachment) without last modification date found" warnings in my log.

You may sometimes see harmless warnings like these in your Confluence log:

[WARN] Attachment without last modification date found: #53090452 "foobar.jpg"
...
[WARN] Page without last modification date found: #52527113 "My super duper wiki page"

This means that Better Content Archiving found a page or an attachment with a NULL (unknown) last modification date while traversing your content. As the last update date is unknown, the age (the time since the last update) cannot be calculated, and such items are skipped by the Better Content Archiving app.

Please note that under normal circumstances all pages and attachments should have valid last modification dates. These corrupt records were probably created by some "incomplete" app, integration or data migration. You can safely ignore these warnings, or decide to fix the data.

How to fix these?

  1. If there are only a few of these, just edit the page or re-upload the attachment. The log entry shows the numerical identifier and the page title or attachment filename, so you can easily find them.
  2. If there are many of these, run an SQL UPDATE statement against your database to initialize the value in the corresponding records. One trivial approach is to initialize the missing value to the creation date of the page or attachment, to the current date, or to some other fixed date.
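As an illustration only (the table and column names below are assumptions based on recent Confluence schemas, not verified for your version; check them against your own database and take a full backup before running any UPDATE), initializing the missing value from the creation date could look like:

```sql
-- Hypothetical sketch: assumes pages and attachments live in the CONTENT table
-- with CREATIONDATE and LASTMODDATE columns. Verify against your schema and
-- back up the database before running.
UPDATE CONTENT
SET LASTMODDATE = CREATIONDATE
WHERE LASTMODDATE IS NULL;
```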

I just installed Better Content Archiving and it shows that all pages are viewed. ("zero not-viewed pages" problem)

Current status: Better Content Archiving 7.4.0 introduces the "page view initialization" feature that solves this problem.

As you probably know, Confluence does not implement any sort of page view tracking. Because the workflows implemented by Better Content Archiving heavily rely on the last page view information, the app implements its own real-time page view tracking mechanism.

As this is the app itself that implements the page view tracking, there are some obvious consequences:

  1. The periods when the app was not installed are not tracked.
    If the app was installed on Jan 10, it cannot know what happened before that. Even if you activate page view tracking immediately, it will have nothing to report at first. If you set the "not viewed" interval to e.g. 60 days, then the first pages should be reported around March 10 (60 days after the installation date, when the tracking actually started).
    Since app version 7.4.0, this problem is completely eliminated by initializing page views.
  2. Similarly, the periods when the app was not licensed are not tracked.
    Make sure that you have a valid license installed.

I have lots of pages that I know are not viewed, but those are not reported. ("abandoned pages" problem)

Current status: Better Content Archiving 7.4.0 introduces the "page view initialization" feature that solves this problem.

As you probably know, Confluence does not implement any sort of page view tracking. Because the workflows implemented by Better Content Archiving heavily rely on the last page view information, the app implements its own real-time page view tracking mechanism. Even with this mechanism in place, there is (was) a limitation that confused many in the past.

If a page was last viewed before the Better Content Archiving app got installed and never after the installation, the app has zero information about the last view. (We called these pages the "abandoned pages" in our internal lingo.)

The app is designed to skip these abandoned pages to avoid distorted statistics and unexpected archivings. The idea is that if we don't precisely know the page's last view, the safest option is to skip the page: not reporting it and not doing anything potentially dangerous with it, like archiving.

And, here is the problem. The drawback of this safety-first behavior is that if a page is never viewed after installing the app, it is skipped at every check, effectively staying "under the radar" forever.

This was a confusing non-critical issue with previous app versions, and it is a non-issue since app version 7.4.0. The "page view initialization" feature allows populating last page view information for any page, eliminating "abandoned pages". See the page view initialization section for more.

The execution of the "Move" strategy on a large number of pages is slow.

Page moves were relatively slow in pre-5.10.18 Confluence versions, then got faster in 5.10.18 (due to a rewrite of the corresponding code in Confluence). Unfortunately, they became slower again in newer Confluence versions (in 6.0.1, for instance). It is important to understand that although page moves are not fast, they are completely reliable and stable!

Midori investigated the problem, and it turned out that most of the execution time is spent in Confluence core, while it refactors (fixes) links pointing to the moved pages. We reported this to Atlassian: CONFSERVER-53503 (migrated from the original CE-1045).

We definitely encourage you to vote for the bug, to increase its importance for the Confluence developer team.

At the moment, there is one simple way to accelerate page moves: increasing the heap size. See the instructions.

According to our tests, increasing the heap size from the default 1024M to 1536M reduced the execution time by 30% for 800 pages. Your mileage may vary, but it is definitely worth a try.
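As a sketch only (a Linux-style installation is assumed here; the exact file and variable depend on your platform and Confluence version, so follow the linked instructions for your setup), the change boils down to raising the -Xmx JVM option:

```
# <Confluence installation directory>/bin/setenv.sh (illustrative fragment)
# Raise the maximum heap from the 1024M default to 1536M:
CATALINA_OPTS="-Xms1024m -Xmx1536m ${CATALINA_OPTS}"
```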

"Last Execution" at the Scheduled Jobs is empty for "Analyze Content Quality" and "Find and Archive Expired Content".

This is a purely cosmetic problem which you can safely ignore. Nevertheless, we describe the root cause below.

Prior to Confluence 5.10, Confluence used the Quartz library to run periodic jobs and required apps wanting to run periodic jobs to declare "trigger" and "job descriptor" type modules. It worked as expected with Better Content Archiving, also capturing the execution history.

In Confluence 5.10, Atlassian switched to a new technology called Caesium and required apps to declare new module types. Luckily, Confluence still supports the old approach for backward compatibility, and it works almost perfectly, but there are some minor glitches like this one.

In fact, the jobs are executed correctly, but their execution history is stored in a cache instead of being persisted to the database as before. And, as caches are flushed once a day, the execution history of infrequently executed jobs is cleared, and is therefore not displayed in the UI most of the time.

Better Content Archiving will be converted to the modern approach at some point. The reason we're not doing this right away is that Better Content Archiving supports every Confluence version starting from 5.7! The conversion would make it impossible to support 5.7-5.9, which would be an overly high price for this change.

My background jobs are (randomly) not executed.

Current status: Better Content Archiving 6.0.0 executes background jobs several orders of magnitude faster than previous versions, therefore the chance of collisions has been very low since 6.0.0.

Although Better Content Archiving relies on several background jobs, it does not allow multiple jobs to run concurrently (at the same time), in order to prevent synchronization problems. This is a simple and robust synchronization rule that works very well in most cases.

Due to this rule though, you may randomly see jobs not being executed. Typical symptoms:

  1. The work that is supposed to be done by the job is not actually done. For instance, the Quality Statistics are not updated by the "Analyze Content Quality" job, or the notification emails are not sent by the "Find and Archive Expired Content" job.
  2. The job's execution history shows durations of 1-5 milliseconds, which are unrealistically short for an actual execution.
  3. There are warnings like this written to the Confluence log (the text may vary between app versions):
    [WARN] Failed to start the Content Archiving task as Content Quality Analysis is already running

The root cause is simply that the job executions occasionally overlap in time. For instance, if job "A" runs every day at 02:00AM and job "B" every day at 02:30AM, the jobs will only work reliably if "A" always completes within 30 minutes. If "A" sometimes takes longer than 30 minutes, "B" will start, detect that "A" is still running, write the warning to the log and exit, and thus fail to run that day.
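The timing rule above can be sketched in a few lines (a hypothetical helper, not part of the app; it only illustrates the overlap condition, assuming fixed same-day start times and known durations):

```python
from datetime import timedelta

def jobs_collide(start_a: timedelta, duration_a: timedelta, start_b: timedelta) -> bool:
    """True if job B would start while job A is still running (same day)."""
    return start_a <= start_b < start_a + duration_a

# Job "A" starts at 02:00AM; job "B" starts at 02:30AM.
a_start = timedelta(hours=2)
b_start = timedelta(hours=2, minutes=30)

print(jobs_collide(a_start, timedelta(minutes=45), b_start))  # True: "B" is skipped that day
print(jobs_collide(a_start, timedelta(minutes=25), b_start))  # False: "A" finished in time
```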

What causes overlapping jobs?

  1. If you recently changed the schedule of the jobs, you may have made a mistake. (Read on to learn how to set up a correctly working schedule.)
  2. Jobs take longer than expected to complete (typically due to the growing size of your data), and the schedule no longer fits.

Solution: configure the jobs' schedules so that the execution overlaps are eliminated.

Below is a simple method to find your optimal schedule:

  1. Check the average duration of each of the app's jobs based on the execution history. This includes the 3 jobs whose names start with "Better Content Archiving" (in the top part of the alphabetically sorted job list). You can skip the "Better Content Archiving: Persist the XYZ Journal" jobs, as those are not relevant and are allowed to run concurrently. When calculating the average durations, ignore the executions that completed in less than a second (an "immediate exit").
  2. Design your optimal schedule based on the jobs' average durations and your preferences. For instance:
    • I want the "Find and Archive Expired Content" job to run daily around 2AM; it takes 80 minutes, and it is the most important one.
    • I want the "Analyze Content Quality" job to run daily around 4AM; it takes 55 minutes.
    • I want the "Warm up the Content Status Cache" job to run every 10 minutes; it normally takes 50 minutes. (Note: this job was removed in 6.0.0+ app versions.)
    In a scenario like this, a potentially good schedule that eliminates the job collision is:
    • "Find and Archive Expired Content" job: every day at 2AM. (It will complete by around 3:20AM.)
    • "Analyze Content Quality" job: every day at 4AM, so that it allows an extra 40 minute safety window for the previous job to complete. (It will complete by around 4:55AM.)
    • "Warm up the Content Status Cache" job: every 10 minutes, excluding the period 0:30AM - 5:30AM. With 50 minutes of execution time, this means it will not run between 1:20AM and 5:30AM, allowing a 30 minute window before the first job and 35 minutes after the second. (Caches can safely go cold during the hours when your users would not check content status.)
  3. When you designed the schedule, apply that to the jobs.
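If your Confluence version lets you enter the schedule as a cron expression, the two daily jobs in the example above could be written roughly like this (Quartz cron syntax: seconds, minutes, hours, day-of-month, month, day-of-week; shown only as an illustration, check the scheduler UI your Confluence version offers):

```
0 0 2 * * ?    "Find and Archive Expired Content": every day at 2:00AM
0 0 4 * * ?    "Analyze Content Quality": every day at 4:00AM
```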

It may look difficult, but it is fairly straightforward once you actually do it, believe me.

Every environment and every team's requirements are a little different, so there is no single schedule that works perfectly everywhere. Once the new schedule is in place, it will work consistently.

I see "net.sf.hibernate.ObjectNotFoundException: No row with the given identifier exists" exceptions.

Current status: Atlassian fixed this in Confluence 6.2.2.

This problem occurs when executing the "Move" strategy in Confluence 6.0.x, 6.1.x, and 6.2.x, but not after 6.2.2. It affects only certain pages, but affects them consistently, meaning that repeated re-runs of the archiving job will fail with the same exception.

See this issue for details, in which Atlassian and Midori investigated the problem: CE-1044. The investigation led to a bugfix released in Confluence 6.2.2.

If you can't upgrade to 6.2.2 right away, then the potential workarounds are:

  1. Keep using all other features of the Better Content Archiving app, but do not archive pages in the spaces where the problem occurs until you can upgrade. For this, just turn off all page-archiving-related triggers in the configurations applied to the problematic spaces.
  2. If you can identify the problematic page, add the noarchive-single label to it to disable the app on that single page. When you upgrade to 6.2.2, remember to remove the label. (You can also use the noarchive label to disable the app on a whole subtree of pages.) To find the problematic page, increase the Better Content Archiving app's logging level; the problematic page will be the parent of the page being moved right before the exception.
  3. Switch to the "Copy and Trash" strategy temporarily. Switching to that strategy for a single archiving execution may archive the problematic page, after which you can switch back to "Move" immediately. Before doing this, please see the comparison of the "Move" and "Copy and Trash" strategies to avoid surprises!

The safest option is the first one.

I see "Cannot add an existing ancestor as a child!" exceptions.

Running the "Move" archiving strategy may result in the java.lang.IllegalArgumentException: Cannot add an existing ancestor as a child! exception in certain spaces.

What's the root cause? Confluence uses an accelerator database table to maintain ancestors (i.e. children of children of pages, recursively to unlimited depth). If this table loses its integrity (which is not related to Better Content Archiving!), it may lead to this and similar errors.

Please follow the official guide to fix the table; it's easy. (If it does not help on the first try, stop Confluence, run the delete from confancestors SQL statement to clear the table, then restart Confluence and only then rebuild the table.)

After the table's integrity is restored, re-run the archiving.

I see "Comparison method violates its general contract" exceptions.

Current status: Midori provides a workaround for this problem in Better Content Archiving 6.1.1, while Atlassian provides a proper fix in Confluence 6.2.1.

When using the app, some pages might display the java.lang.IllegalArgumentException: Comparison method violates its general contract! exception.

This is caused by a bug in some specific versions of the Java runtime and in Confluence core. We have already reported the problem to Atlassian: CONF-45910. We definitely encourage you to vote for the bug, to increase its importance for the Confluence developer team.

Luckily, there is an easy workaround until the actual bugfix is available! Adding this JVM system property completely eliminates the problem:

java.util.Arrays.useLegacyMergeSort=true

The way of defining the system property depends on your operating system, but this page in the Confluence administrator's guide explains all scenarios. After the restart, your change is picked up and the problem is gone.
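For example, on a Linux-style installation (illustration only; the exact file depends on your platform, so follow the linked guide) the property can be appended to the JVM arguments:

```
# <Confluence installation directory>/bin/setenv.sh (illustrative fragment)
CATALINA_OPTS="-Djava.util.Arrays.useLegacyMergeSort=true ${CATALINA_OPTS}"
```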

I see "java.lang.NullPointerException at NaturalStringComparator.compareNatural()" exceptions.

When using the app, especially while content indexing, the following error is written to your log:

2018-01-02 10:57:10,031 ERROR [Long running task: Content Event Indexing [CARCH]] [service.task.base.AbstractArchivingLongRunningTask] runInternal Failed to index content events
 -- url: /admin/plugins/archiving/start-content-event-indexing.action | referer: /admin/plugins/archiving/statistics.action | traceId: a5b3407986727e99 | userName: admin | action: start-content-event-indexing
java.lang.NullPointerException
	at com.atlassian.confluence.pages.NaturalStringComparator.compareNatural(NaturalStringComparator.java:74)
	...

This is caused by pages with blank titles (i.e. NULL saved in the database), as those cannot be compared against other pages. These pages are considered corrupt data, created by a buggy app, integration or other mechanism that does not obey the data integrity rules in Confluence (which require all pages to have non-blank titles).

These should be fixed not only for the sake of Better Content Archiving, but also so that all Confluence features work properly. Please follow the official guide written by Atlassian to fix the problem. It's easy.

Questions?

Ask us any time.