Backup job fails

Issue

Backup jobs fail with ORACLE_DTC 20 error. You get the following error message:

Failed to authenticate database credentials [Failed to validate db creds. SP2-1502: The HTTP proxy server specified by http_proxy is not accessible]

Cause

If you have File Server, NAS, and Oracle on the same machine, and the config file specifies a web proxy type other than HTTP, Oracle jobs fail after you upgrade the Oracle agent to version 7.0.0-419591. For example, if the File Server client is configured with socks4 and Oracle is configured with HTTP on the same machine, the File Server configuration takes precedence after the upgrade. Because Oracle supports only HTTP proxies, Oracle jobs fail.

Resolution

Set the proxy type to HTTP in the config file, and then retrigger the backup job.
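
Before retriggering the job, it can help to confirm that the proxy value really uses the HTTP scheme. The following is a minimal sketch, not part of the agent: the function name is_http_proxy and the example proxy URLs are assumptions made for illustration, and the sketch does not read the agent's config file.

```python
from urllib.parse import urlparse


def is_http_proxy(proxy_url: str) -> bool:
    """Return True only when the proxy URL uses the plain HTTP scheme."""
    return urlparse(proxy_url.strip()).scheme.lower() == "http"


# Hypothetical proxy values; replace them with the value from your config file.
for candidate in ("http://proxy.example.com:8080", "socks4://proxy.example.com:1080"):
    verdict = "ok" if is_http_proxy(candidate) else "not usable for Oracle jobs"
    print(candidate, "->", verdict)
```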

Point in Time restore to an alternate server fails

Issue

Point in Time restore to an alternate server fails with ORACLE_DTC 5 error. You get the following error message:

archived log file name=/u03/app/oracle/flash_recovery_area/SANDBOX/archivelog/2023_01_26/o1_mf_1_164_kx5xd3wj_.arc thread=1 sequence=164
channel default: deleting archived log(s)
archived log file name=/u03/app/oracle/flash_recovery_area/SANDBOX/archivelog/2023_01_26/o1_mf_1_164_kx5xd3wj_.arc RECID=186 STAMP=1127144995
unable to find archived log
archived log thread=1 sequence=165
released channel: ch0
released channel: ch1
released channel: ch2
released channel: ch3
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of recover command at 01/26/2023 15:49:56
RMAN-06054: media recovery requesting unknown archived log for thread 1 with sequence 165 and starting SCN of 1504105

Cause

You get this error because additional archive logs are generated when the PIT is created. As a result, when you restore to the selected timestamp, these logs are not yet available for restoration. They become available only in the next PIT.

Resolution

For PIT restore, select a timestamp a few seconds after the desired time. For example, if you want to perform PIT restore at 11:20 AM, select the timestamp as 11:21:05 AM.
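
The padding can be computed rather than guessed. The sketch below assumes a buffer of 65 seconds only because that matches the example above (11:20 AM becomes 11:21:05 AM); the function name padded_restore_time and the buffer value are illustrative, not a product requirement.

```python
from datetime import datetime, timedelta


def padded_restore_time(desired: datetime, buffer_seconds: int = 65) -> datetime:
    """Return the timestamp to select so that trailing archive logs are covered."""
    return desired + timedelta(seconds=buffer_seconds)


# Desired recovery point: 11:20:00 AM on the day of the restore.
desired = datetime(2023, 1, 26, 11, 20, 0)
print(padded_restore_time(desired).strftime("%H:%M:%S"))  # prints "11:21:05"
```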

Point in Time restore to a standby database fails

Issue

Point in Time restore to a standby database fails with ORACLE_DTC 5 error. You get the following error message in RMAN logs:

ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01152: file 1 was not restored from a sufficiently old backup
ORA-01110: data file 1: '/home/oracle/oracle_base/oradata/STBY/datafile/o1_mf_system_l3hb87d5_.dbf'

Cause

You get this error because RMAN could not correctly apply the logs from the selected point-in-time restore point.

Resolution

For PIT restore, restore to the primary database, and then synchronize the standby database with the primary database again.

Failed to restore the database to an alternate server

You might encounter the following issues while restoring the database to an alternate server:

Issue

You get the following error message:

ORA-28365: wallet is not open

Cause

You may get this error if:

The wallet path is set incorrectly.
The wallet status is not OPEN.
The user is not able to access the wallet directory.

Resolution

Perform the following steps:

1. Check the wallet status by running the following query:

select * from V$ENCRYPTION_WALLET;

2. Check if the WRL_PARAMETER field has the correct wallet path, and the STATUS field is set to OPEN.

3. Make sure the user, group, and their respective permissions are the same on the source and the destination server and the user on the destination server has access to the wallet location.

Issue

You get the following error message:

ORA-27086: unable to lock file - already in use
Linux-x86_64 Error: 11: Resource temporarily unavailable

Cause

A database is already running on the restore location and using the data file.

Resolution

Perform either of the following actions:

Use a different location for the restore.
Shut down the database running on the restore location. The restore then overwrites the existing database.
Drop the database running on the restore location. This removes the files used by the existing database.

Issue

You get the following error message:

ORA-19554: error allocating device, device type: SBT_TAPE, device name:
ORA-27211: Failed to load Media Management Library

Cause

You may get this error for multiple reasons. One possible cause is that the DB user on the source server differs from the DB user on the destination server.

Resolution

Use the same DB user on the source and destination server to restore the database to the alternate server.

Issue

You get the following error message:

ORA-01261: Parameter db_create_file_dest destination string cannot be translated
ORA-01262: Stat failed on a file destination directory
Linux-x86_64 Error: 2: No such file or directory
SQL> Disconnected

Cause

You get this error when the spfile is restored, but the paths specified by the following parameters do not exist on the target server:

control_files
db_recovery_file_dest
log_archive_dest
audit_file_dest
db_create_file_dest

Resolution

Create the required directory structure on the target server and make sure the Oracle DBA user has access to it.
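
To find out which directories are missing before you retrigger the restore, you can scan a text copy of the parameter file. The following is a minimal sketch under stated assumptions: the helper missing_directories is hypothetical, it expects simple name='value' lines (optionally prefixed with *.), and it handles only parameters whose value is a single directory. Parameters that hold file names, such as control_files, need their parent directory checked instead.

```python
import os
import re

# Matches lines such as: *.db_create_file_dest='/u01/oradata'
PARAM_LINE = re.compile(r"^\s*(?:\*\.)?(\w+)\s*=\s*'?([^'\s]+)'?\s*$")


def missing_directories(pfile_text, dir_params):
    """Return {parameter: path} for directory parameters whose path does not exist."""
    missing = {}
    for line in pfile_text.splitlines():
        match = PARAM_LINE.match(line)
        if not match:
            continue
        name, path = match.group(1).lower(), match.group(2)
        if name in dir_params and not os.path.isdir(path):
            missing[name] = path
    return missing


# Hypothetical parameter file content, used only for illustration.
sample = "*.db_create_file_dest='/home/oracle/temp_olddb/olddb'\n*.audit_file_dest='/home/oracle/audit'\n"
for name, path in missing_directories(sample, {"db_create_file_dest", "audit_file_dest"}).items():
    print(f"{name}: create {path} and grant the Oracle DBA user access")
```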

Issue

You get the following error message:

ORA-27125: unable to create shared memory segment
Linux-x86_64 Error: 28: No space left on device

Cause

You get this error message when you restore the spfile along with the database, and the memory available on the target server is insufficient for the SGA and PGA sizes needed to start the database.

Resolution

Free up the required memory or reduce the SGA and PGA memory in the spfile of the database.

Issue

You get the following error message:

ORA-01017: invalid username/password; logon denied

Cause

You get this error when the wallet is either not configured or configured incorrectly on the target server. The restore also fails with this error if you provide incorrect credentials for the target database.

Resolution

Make sure that you configure the wallet correctly on the target server and provide correct database credentials.

Issue

You get the following error message:

NID-00111: Oracle error reported from target database while executing
begin       dbms_backup_restore.nidprocessdf(:fno, :istemp, :skipped, :idchged,                                      :nmchged);    end;

ORA-20000: File /home/oracle/oracle_base/oradata/CDB01/datafile/o1_mf_temp_knf4c73o_.tmp has wrong dbid or dbname, remove or restore the offending file.

Cause

When you retrigger restore on an already restored database, the name of the database being restored gets updated in the temp file. However, the temp file name remains the same as the earlier restore. This causes conflict, and the restore fails.

Resolution

Delete the temp file and retrigger restore.

Issue

You get the following error message:

ORA-27072: File I/O error

Linux-x86_64 Error: 28: No space left on device

Cause

The available disk space on the target server is not sufficient to restore the database.

Resolution

Make sure the target server has enough free disk space to restore the database.
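
A quick pre-check of free space at the restore location can avoid a failed restore. This is a minimal sketch: the function name has_free_space and the 20 GiB figure are assumptions for illustration, and the required size depends on the database you restore.

```python
import shutil


def has_free_space(path: str, required_bytes: int) -> bool:
    """Return True when the file system that holds path has enough free bytes."""
    return shutil.disk_usage(path).free >= required_bytes


# Hypothetical requirement of 20 GiB, checked against the current directory.
required = 20 * 1024 ** 3
print("enough space" if has_free_space(".", required) else "free up space before restoring")
```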

Issue

You get the following error message:

Failed to authenticate database credentials [Failed to fetch authentication details]

Cause

The Oracle Home and Oracle Base that you provided are incorrect or do not exist on the target server.

Resolution

Provide the correct Oracle Home and Oracle Base, and then retrigger the restore.

Issue

You get the following error message:

RMAN-03002: failure of recover command at 10/12/2022 05:41:06

ORA-19698: /home/oracle/oracle_base/oradata/NCDB_19C/onlinelog/o1_mf_1_kklyyd23_.log is from different database: id=3039599149, db_name=CLN1

Cause

When you restore a database, the log file contains logs of that database. When you retrigger restore of the same database, the log file contains the logs of the earlier restore, and the name of the log file also remains the same as that of the earlier restore. This causes conflict, and the restore fails.

Resolution

Drop the database that uses the log files using the following commands and retrigger restore:

shutdown abort;
startup mount exclusive restrict;
drop database;
exit

Issue

You get the following error message:

ORA-19504: failed to create file "+DATA/MHRACDB/DATAFILE/tbspace1.dbf"

ORA-17502: ksfdcre:3 Failed to create file +DATA/MHRACDB/DATAFILE/tbspace1.dbf

ORA-15001: diskgroup "DATA" does not exist or is not mounted

Cause

You get this error if the control file and spfile created in the earlier restore are not deleted and the database is running on the ASM.

Resolution

Perform the cleanup activity by deleting the old control file and spfile of the database created in the last restore. You can find these files in $ORACLE_HOME/dbs/ with the names cntrl<dbname>.dbf, init<dbname>.ora, and spfile<dbname>.ora respectively.
To restore the database without performing the cleanup activity, configure ASM on the target server with the same configurations as that of the source database.

Issue

You get the following error message:

level=debug ts=2022-11-15T21:08:05.997485224+05:30 filename=cmd.go:162 message="stdout: cmd: RMAN-00571: ===========================================================" Layer=OracleApiUtil

level=debug ts=2022-11-15T21:08:05.997514404+05:30 filename=cmd.go:162 message="stdout: cmd: RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============" Layer=OracleApiUtil

level=debug ts=2022-11-15T21:08:05.997529214+05:30 filename=cmd.go:162 message="stdout: cmd: RMAN-00571: ===========================================================" Layer=OracleApiUtil

level=debug ts=2022-11-15T21:08:05.99754723+05:30 filename=cmd.go:162 message="stdout: cmd: RMAN-03002: failure of sql statement command at 11/15/2022 21:08:05" Layer=OracleApiUtil

level=debug ts=2022-11-15T21:08:05.997561437+05:30 filename=cmd.go:162 message="stdout: cmd: ORA-01103: database name 'ORCLDB' in control file is not 'ORCL_DB'" Layer=OracleApiUtil

Cause

You get this error if the SID and the restored database name are different, and spfile is not selected for restore.

Resolution

Select the Restore SP File field and retrigger restore.

Issue

You get the following error message:

level=debug ts=2022-12-12T06:43:11.145339629Z filename=cmd.go:162 message="stdout: cmd: RMAN-00571: ===========================================================" Layer=OracleApiUtil
level=debug ts=2022-12-12T06:43:11.14537176Z filename=cmd.go:162 message="stdout: cmd: RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============" Layer=OracleApiUtil
level=debug ts=2022-12-12T06:43:11.145387446Z filename=cmd.go:162 message="stdout: cmd: RMAN-00571: ===========================================================" Layer=OracleApiUtil
level=debug ts=2022-12-12T06:43:11.145431387Z filename=cmd.go:162 message="stdout: cmd: RMAN-03002: failure of sql statement command at 12/12/2022 06:43:11" Layer=OracleApiUtil
level=debug ts=2022-12-12T06:43:11.145447413Z filename=cmd.go:162 message="stdout: cmd: ORA-00723: Initialization parameter COMPATIBLE must be explicitly set" Layer=OracleApiUtil

Cause

You get this error if the compatible parameter value is different in the source and the target database.

Resolution

If the Restore SP File field is selected, make sure that the directory structure mentioned in the SP file exists on the target server. If not, create the directory structure on the target server as mentioned in the SP File.

If the Restore SP File field is not selected, perform the following steps on the target server:

1. Create the pfile with the following parameters on the destination database:
db_name=<source_db_name>
undo_management=AUTO
db_create_file_dest='<restore location/source_db_name>'
compatible=<As per source DB>
Sample config:
db_name=olddb
undo_management=AUTO
db_create_file_dest='/home/oracle/temp_olddb/olddb'
compatible=18.0.0
2. Set the ORACLE_SID environment variable:
export ORACLE_SID=<source_db_name>
3. Connect to the database using the sqlplus command.
4. Shut down the source database using the shutdown abort command on the target server.
5. Start the destination server instance with the pfile created in step 1 by running the following command:
startup nomount pfile='<pfile_path_created_in_step_1>'
6. Run the following command:
create spfile from pfile='<pfile_path_created_in_step_1>';
7. Trigger restore to the alternate server from the #######{{phoenixconsole}}.
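
The pfile from the first step can be generated from the three values that vary between restores. This is a minimal sketch, assuming the four parameters listed above are all you need; the function name build_pfile is hypothetical, and the values reproduce the sample config.

```python
def build_pfile(db_name: str, restore_location: str, compatible: str) -> str:
    """Return pfile text with the four parameters needed for the restore."""
    lines = [
        f"db_name={db_name}",
        "undo_management=AUTO",
        f"db_create_file_dest='{restore_location}/{db_name}'",
        f"compatible={compatible}",
    ]
    return "\n".join(lines) + "\n"


# Values taken from the sample config; compatible must match the source database.
print(build_pfile("olddb", "/home/oracle/temp_olddb", "18.0.0"), end="")
```

Write the returned text to a file and pass its path to the startup nomount command.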

Select all query on DBA tables fails

Issue

After you restore the database to an alternate location, the select all query on DBA tables fails and you get the following error:

ORA-25153: Temporary Tablespace is Empty

Cause

You get this error if you do not add the temporary file to the temporary tablespace.

Resolution

Add the temporary file to the temporary tablespace by using the following command:

alter tablespace <temp tablespace name> add tempfile '<datafile_path>' size <size> reuse autoextend on next <size> maxsize unlimited;

For example:

alter tablespace TEMP add tempfile '/home/oracle/oracle_base/oradata/DB12C/TEMP01.dbf' size 2789212160 reuse autoextend on next 10485760 maxsize unlimited;

Unable to perform large Oracle DTC restore

Issue

The PhoenixIOServer process crashes leading to the failure of a database restore. The /var/log/messages file shows the following messages:

eb-db0 kernel: Out of memory: Kill process 4495 (PhoenixIOServer) score 259 or sacrifice child

eb-db0 kernel: Killed process 4495 (PhoenixIOServer) total-vm:7833136kB, anon-rss:6477996kB, file-rss:0kB, shmem-rss:0kB

eb-db0 kernel: oom_reaper: reaped process 4495 (PhoenixIOServer), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB

Cause

One possible reason is that the PhoenixIOServer process consumes a large amount of memory and is killed by the OOM (Out-Of-Memory) reaper.
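
To confirm that the OOM reaper killed the process, and to see how much memory the process held, you can scan the kernel messages. This is a minimal sketch that assumes the kernel message layout shown above; the function name oom_kills is hypothetical, and other kernel versions may format these lines differently.

```python
import re

# Captures the PID, the process name, and the anonymous resident memory in kB.
KILLED = re.compile(r"Killed process (\d+) \(([^)]+)\) total-vm:\d+kB, anon-rss:(\d+)kB")


def oom_kills(log_text):
    """Yield (pid, process name, resident memory in GiB) for each OOM kill."""
    for match in KILLED.finditer(log_text):
        yield int(match.group(1)), match.group(2), int(match.group(3)) / 1024 ** 2


sample = (
    "eb-db0 kernel: Killed process 4495 (PhoenixIOServer) "
    "total-vm:7833136kB, anon-rss:6477996kB, file-rss:0kB, shmem-rss:0kB"
)
for pid, name, gib in oom_kills(sample):
    print(f"{name} (pid {pid}) was killed while holding {gib:.2f} GiB")
```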

Resolution

Disable the OOM reaper by running the following command:
sudo -s sysctl -w vm.oom-kill=0
For more information about enabling or disabling OOM reaper, see How to Adjust Linux Out-Of-Memory Killer Settings.
Reduce the number of worker threads used for the restore by setting the DATAMOVER_WORKERS attribute in the /etc/PhoenixOracle/Phoenix.yml file, and then restart the PhoenixOracle service.

❗ Important

Reducing the number of worker threads can impact restore time.

Failure of the FULL backup job due to missing archive logs

Issue

The FULL backup job fails with the following error:

===========================================================

=============== ERROR MESSAGE STACK FOLLOWS ===============

===========================================================

failure of backup command at 05/20/2022 19:30:17

expected archived log not found, loss of archived log compromises recoverability

error identifying file /oracle/ora12cdb/product/12.1.0/db_1/dbs/arch/1_82426_1074616154.dbf

unable to obtain file status

Linux-x86_64 Error: 2: No such file or directory

Cause

The backup job might fail due to missing archive logs.

Resolution

1. Run the crosscheck archivelog all command through RMAN, and then delete the expired archive log files by running the following command:
delete expired archivelog all;

2. Retrigger FULL backup.

Failed to discover database

The discovery of databases hosted on the Oracle servers that are registered with Druva might fail due to various reasons. See the following table for possible causes and their respective resolutions:

Cause	Resolution
The PhoenixOracle service is not running.	Start the PhoenixOracle service and try again.
The Oracle database instance is offline.	Make sure the database is up and running.
The database name is unavailable in the DBA registry.	Add the database name entry in the DBA registry located at /etc/oratab.
Database authentication is not defined.	Assign database authentication.
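
To check whether a database has an entry in /etc/oratab, you can parse the file for its SID. This is a minimal sketch that assumes the standard oratab line layout of SID:ORACLE_HOME:Y or N; the function name oratab_sids and the sample entries are illustrative.

```python
def oratab_sids(oratab_text: str) -> set:
    """Return the set of SIDs listed in oratab content, skipping comments."""
    sids = set()
    for line in oratab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        if len(fields) >= 2 and fields[0]:
            sids.add(fields[0])
    return sids


# Hypothetical oratab content; on a server, read /etc/oratab instead.
sample = "# comment line\nORCL:/u01/app/oracle/product/12.2.0.1/db_1:N\n"
print("ORCL" in oratab_sids(sample))  # prints "True"
```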

Failure of RMAN backup

RMAN backup jobs might fail due to various reasons. See the following table for possible causes and their respective resolutions:

Cause	Resolution
The specified user does not have the SYSDBA/SYSBACKUP privileges to perform RMAN backups.	Assign the SYSDBA/SYSBACKUP privileges to the user and try again.
The database is not in the ARCHIVELOG mode.	Enable the Perform offline backup if database is in NOARCHIVELOG mode RMAN setting in the backup policy.
The PhoenixOracle service is not running.	Start the PhoenixOracle service and try again.

Failure of offline RMAN backup

Issue

Offline RMAN backup might fail with the following error:

TNS:listener does not currently know of service requested in connect descriptor

Cause

When the database is in the NOARCHIVELOG mode, the backup shuts down the database and brings it to the mount state. While the database is shut down, its service is no longer registered with the TNS listener, so connections through the listener fail.

Resolution

Add the listener entry to the static list so that connections through the listener work even when the database is down. The same applies to restore operations.

For a standalone database, add the following entry in the listener.ora file located at $ORACLE_HOME/network/admin:

SID_LIST_LISTENER=
  (SID_LIST=
    (SID_DESC=
      (ORACLE_HOME=/u01/app/oracle/product/12.2.0.1/db_1)
      (SID_NAME=ORCL))
  )

For a RAC database, if the database name is RAC and the instance ID is RAC2, add the following entry in the listener.ora file on each node:

SID_LIST_LISTENER=
  (SID_LIST=
    (SID_DESC=
      (GLOBAL_DBNAME=RAC)
      (ORACLE_HOME=/u01/app/oracle/product/12.2.0.1/db_1)
      (SID_NAME=RAC2))
  )
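
If you maintain several nodes, generating the static entry reduces the risk of unbalanced parentheses. This is a minimal sketch: the function name static_listener_entry is hypothetical, the listener is assumed to be named LISTENER, and the Oracle Home and SID values are the examples used above.

```python
def static_listener_entry(oracle_home: str, sid_name: str, global_dbname: str = "") -> str:
    """Return a SID_LIST_LISTENER block for listener.ora."""
    desc = []
    if global_dbname:
        # GLOBAL_DBNAME is included only when a value is passed, as in the RAC entry.
        desc.append(f"      (GLOBAL_DBNAME={global_dbname})")
    desc.append(f"      (ORACLE_HOME={oracle_home})")
    desc.append(f"      (SID_NAME={sid_name}))")
    lines = ["SID_LIST_LISTENER=", "  (SID_LIST=", "    (SID_DESC=", *desc, "  )"]
    return "\n".join(lines) + "\n"


# Entry for instance RAC2 of the database named RAC.
print(static_listener_entry("/u01/app/oracle/product/12.2.0.1/db_1", "RAC2", "RAC"), end="")
```

For a RAC database, run the function once per node with that node's instance ID.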

Failure of #######{{R_point}} restore

Restoring databases from #######{{rpoints}} might fail for the following reasons:

Cause	Resolution
Insufficient space on the Oracle server host on which the #######{{rpoint}} restore is requested.	Make sure the Oracle server host has the required space.
RMAN does not have enough permissions to read downloaded backup data.	Check the RMAN logs and give the required permissions to the backup folder.
An invalid path is specified in the Restore Location field on the Restore Target page.	Make sure you provide the correct path.

Oracle incremental backups get converted to full backups

Issue

When an incremental backup is triggered, the #######{{phoenixconsole}} requests information about databases that are updated. #######{{phoenix}} backs up the changed data and creates a #######{{rpoint}} in your storage. During the incremental backup, the changed blocks (delta) from the data files are backed up, but the log files are skipped.

See the following table for the possible causes of incremental backups getting converted to full backups and their respective workarounds:

Cause	Resolution
The incarnation of the database that is being backed up changes.	Perform a full backup to initiate a new archive log chain with the latest incarnation.
The archived log sequences are missing after the last backup.	Perform a full backup to initiate a new archive log chain.
An incremental backup of a database for which a full backup is not complete is attempted.	Perform a full backup to create a baseline for the next incremental backup.
Inconsistencies between the uploaded database files and their metadata are detected. This might happen if the backup information is modified or deleted from the control file externally, maybe via any script, and not by the #######{{phoenixagent}}.	Trigger new backup from the #######{{phoenixconsole}}.
A full backup is executed by the #######{{phoenixagent}}, and a second full backup is executed by a third-party backup tool or an Oracle Server. In this case, the #######{{phoenixagent}} can convert the next incremental backup to a full backup.	Trigger new backup from the #######{{phoenixconsole}}.
Data corruption occurs during the last backup of a database.	Perform a full backup.

Failed to locate Oracle base

Issue

The discovery of RAC databases fails with the following error message:

level=debug ts=2023-03-24T10:19:32.352690135-04:00 filename=srvctl.go:537 message="Srvctl Command Execution" Layer=OracleApi Output="vc03-b0-cluster\n"
level=debug ts=2023-03-24T10:19:32.355725393-04:00 filename=srvctl.go:487 message="Executing Srvctl Command" Layer=OracleApi Command="srvctl config database -d HCMCNP"
level=debug ts=2023-03-24T10:19:32.355976658-04:00 filename=ora_command_util.go:50 message="Process Details" Layer=OracleApiUtil ProcessID=65473
level=info ts=2023-03-24T10:19:32.371412168-04:00 filename=ora_command_util.go:122 message="stderr: cmd: PHOENIX_ORACLE_AGENT_DTC_SRVCTL_SCAN_START"
level=debug ts=2023-03-24T10:19:32.371445945-04:00 filename=ora_command_util.go:156 message="stdout: cmd: PHOENIX_ORACLE_AGENT_DTC_SRVCTL_SCAN_START" Layer=OracleApiUtil
level=debug ts=2023-03-24T10:19:32.5743801-04:00 filename=ora_command_util.go:156 message="stdout: cmd: PRCZ-3002 : failed to locate Oracle base" Layer=OracleApiUtil
level=info ts=2023-03-24T10:19:32.579083815-04:00 filename=ora_command_util.go:122 message="stderr: cmd: PHOENIX_ORACLE_AGENT_DTC_SRVCTL_SCAN_END"
level=debug ts=2023-03-24T10:19:32.579117693-04:00 filename=ora_command_util.go:156 message="stdout: cmd: PHOENIX_ORACLE_AGENT_DTC_SRVCTL_SCAN_END" Layer=OracleApiUtil
level=debug ts=2023-03-24T10:19:33.581845163-04:00 filename=ora_command_util.go:160 message="Exiting cmdOutput" Layer=OracleApiUtil
level=debug ts=2023-03-24T10:19:33.5818451-04:00 filename=ora_command_util.go:126 message="Exiting CmdStdErr" Layer=OracleApiUtil
level=debug ts=2023-03-24T10:19:33.582063352-04:00 filename=srvctl.go:537 message="Srvctl Command Execution" Layer=OracleApi Output="PRCZ-3002 : failed to locate Oracle base\n"

Cause

This error occurs on the RAC cluster nodes if the Oracle base path is not updated in the oraclebasetab file.

Resolution

Add the Oracle base entry to the oraclebasetab file on all nodes of the RAC cluster.

Troubleshooting Oracle DTC issues
