Arul's Oracle Zone

~my references and inferences of Oracle, MySQL, GoldenGate

Configuring swap and tmpdir for MySQL 5.1 in Linux

Posted by Arul Ramachandran on February 9, 2012

1) Operating System
Red Hat Enterprise Linux Server release 5.4 (Tikanga)

2) Architecture
DL580 G7 128GB RAM

3) MySQL Server version
mysql Ver 14.14 Distrib 5.1.60, for unknown-linux-gnu (x86_64) using EditLine wrapper

$ free -m
             total       used       free     shared    buffers     cached
Mem:        128989      21528     107461          0        286      19360
-/+ buffers/cache:       1881     127108
Swap:         4095          0       4095

$ cat /proc/swaps
Filename                      Type        Size      Used  Priority
/dev/mapper/rootvg-swaplv1    partition   2097144   0     -1
/dev/mapper/rootvg-swaplv2    partition   2097144   0     -2

SWAP:

Swap space on Linux can be managed and inspected with a handful of commands.

Check swap space & usage:

The “free -m” output shown above reports the totals; in this case 4GB of swap is configured, with none in use.

Generally speaking, on a database server you do not want to use ANY swap at all. The reason is that the database can manage its in-memory caches much more effectively than the operating system can, since it has deeper knowledge of the data structures. You can check your memory usage as follows:

Free RAM = Total - Used + Cached

Example:

12GB total, 11GB used, 4GB cache.

12 - 11 + 4 = 5GB free RAM

The reason we add the cached value back in is that this is RAM Linux uses to cache disk data. If an application requires the memory, the cache is immediately purged and the pages are handed to the application; since it was only a cache, the data is simply re-read from disk if needed later. By default Linux will use as much RAM as possible for disk cache, but will give it up whenever the RAM is needed elsewhere. While you can modify this behaviour, 99% of the time it works perfectly fine.
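If you want this number without doing the arithmetic by hand, a quick one-liner works; this is just a sketch and assumes the column layout of the RHEL 5 “free -m” output shown above (total, used, free, shared, buffers, cached):

$ free -m | awk '/^Mem:/ { print $2 - $3 + $7 " MB effectively free (total - used + cached)" }'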

This may even result in the system swapping out some unused RAM in preference to dropping the OS cache. That is also OK: most of the time the swapped-out RAM is “dead” memory that was allocated by some application but is never actually used.
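The main knob that controls this tradeoff on Linux is vm.swappiness: lower values discourage swapping out application memory, higher values favor it. A minimal sketch of checking and changing it, assuming you do want to tune it on a dedicated database host (the value 10 is only an example):

$ cat /proc/sys/vm/swappiness
$ sysctl -w vm.swappiness=10                       # takes effect immediately
$ echo "vm.swappiness = 10" >> /etc/sysctl.conf    # persists across reboots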

Check and/or modify devices used for swap:

The “cat /proc/swaps” output shown above lists the devices in use for swap and their sizes.

If you wish to add another device to swap, there are three steps (a combined sketch follows below):

(1) Prepare the device with the “mkswap” command; this formats it as swap space, much like creating any other filesystem (ext3, etc.).

mkswap /dev/path

(2) Add an entry for the device to /etc/fstab so that it is enabled as swap on system startup.

(3) The /etc/fstab entry does not take effect until the next reboot, so to save a reboot you can also enable it at runtime as follows:

swapon /dev/path

If for some reason you need to remove a swap device at runtime, the command is “swapoff”; you may also need to remove its entry from /etc/fstab.
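Putting the three steps together, a minimal sketch looks like this; the device path /dev/sdb1 is only an example, substitute your own swap partition:

# (1) format the partition as swap space
mkswap /dev/sdb1
# (2) make it permanent across reboots
echo "/dev/sdb1  swap  swap  defaults  0 0" >> /etc/fstab
# (3) enable it now without a reboot, then verify
swapon /dev/sdb1
cat /proc/swaps
# to remove it again at runtime
swapoff /dev/sdb1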

Swap Space Sizing

As discussed above, we do not want to use swap on a database server, so it is not important to have a lot of it. However, it can help to have at least some swap (1-4GB) so that any application with “dead” / leaked memory that is not in use can have it swapped out to disk rather than tie up RAM.

However in theory you could run fine with no swap configured at all.

You should configure your monitoring & alert systems to raise an alert if the number of pages swapped in/out becomes excessive, as you may need to adjust your MySQL memory settings to correct the issue.
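A simple way to eyeball this by hand (the same counters most monitoring tools alert on) is vmstat; the si and so columns are pages swapped in and out, and on a healthy database server they should stay at or near zero:

$ vmstat 5        # report every 5 seconds; watch the si/so columns under "swap"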

TMPDIR:

The default location for the temporary directory is /tmp in most cases. This area is almost always set up correctly on Linux machines.

There are two main concerns with /tmp:

(1) Some distributions and systems set up a RAM disk for /tmp backed by swap; you can spot this from the filesystem type being “tmpfs”. This behaviour is not necessarily bad, but you need to be aware of it.
(2) Make sure enough free space is available. The only use MySQL has for this area is when it has to create an on-disk temporary table to satisfy the execution of some queries. In many cases such tables are very small (tens of megabytes), but it is possible your queries could generate tables that are hundreds of megabytes; a quick check is sketched below.

In the event /tmp does fill up, MySQL will simply abort the query and delete the temporary table, so you are unlikely to be left with a full drive for very long.
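A quick way to cover both concerns is to check where MySQL's tmpdir points, what filesystem backs it, and how much space is free; a sketch, assuming the default /tmp location:

mysql> show variables like 'tmpdir';
$ df -h /tmp              # enough free space for on-disk temporary tables?
$ mount | grep ' /tmp '   # "tmpfs" here means /tmp is a swap-backed RAM disk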

Posted in Uncategorized | Leave a Comment »

MySQL: how to delete duplicates from a table

Posted by Arul Ramachandran on January 4, 2012

To show an example of this, let us first create a table called TEMP, which we will then use to build a table called DUPS that contains duplicate records:

mysql> create table temp ( id smallint unsigned not null auto_increment primary key, table_schema varchar(64), table_name varchar(64) );
Query OK, 0 rows affected (0.02 sec)

mysql> desc temp;
+--------------+----------------------+------+-----+---------+----------------+
| Field        | Type                 | Null | Key | Default | Extra          |
+--------------+----------------------+------+-----+---------+----------------+
| id           | smallint(5) unsigned | NO   | PRI | NULL    | auto_increment | 
| table_schema | varchar(64)          | YES  |     | NULL    |                | 
| table_name   | varchar(64)          | YES  |     | NULL    |                | 
+--------------+----------------------+------+-----+---------+----------------+
3 rows in set (0.00 sec)

mysql> insert into temp ( table_schema, table_name ) select table_schema, table_name from information_schema.tables;
Query OK, 89 rows affected (0.01 sec)
Records: 89  Duplicates: 0  Warnings: 0

mysql> select max(id), count(*) from temp;
+---------+----------+
| max(id) | count(*) |
+---------+----------+
|      89 |       89 | 
+---------+----------+
1 row in set (0.00 sec)

Now let us create the DUPS table, into which we will then insert some duplicate records:

mysql> create table dups as select * from temp ;
Query OK, 89 rows affected (0.03 sec)
Records: 89  Duplicates: 0  Warnings: 0

mysql> desc dups;
+--------------+----------------------+------+-----+---------+-------+
| Field        | Type                 | Null | Key | Default | Extra |
+--------------+----------------------+------+-----+---------+-------+
| id           | smallint(5) unsigned | NO   |     | 0       |       | 
| table_schema | varchar(64)          | YES  |     | NULL    |       | 
| table_name   | varchar(64)          | YES  |     | NULL    |       | 
+--------------+----------------------+------+-----+---------+-------+
3 rows in set (0.00 sec)

Let us insert 50 duplicate records:

mysql> insert into dups select * from temp limit 50 ;
Query OK, 50 rows affected (0.00 sec)
Records: 50  Duplicates: 0  Warnings: 0

mysql> select max(id), count(*) from dups ;
+---------+----------+
| max(id) | count(*) |
+---------+----------+
|      89 |      139 | 
+---------+----------+
1 row in set (0.00 sec)

There are a few options to delete the duplicate records from the DUPS table. (Each option below is shown starting from a fresh copy of DUPS with the 50 duplicate rows present.)

Option 1:

Create a new table without the duplicates, drop the original table, and rename the new table to the original name:

mysql> create table dups_temp as select * from dups group by table_schema, table_name ;
Query OK, 89 rows affected (0.04 sec)
Records: 89  Duplicates: 0  Warnings: 0

mysql> select max(id), count(*) from dups_temp ;
+---------+----------+
| max(id) | count(*) |
+---------+----------+
|      89 |       89 | 
+---------+----------+
1 row in set (0.00 sec)

mysql> drop table dups;
Query OK, 0 rows affected (0.12 sec)

mysql> rename table dups_temp to dups;
Query OK, 0 rows affected (0.02 sec)
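One caveat with this approach: CREATE TABLE ... AS SELECT does not carry over indexes or the AUTO_INCREMENT primary key (which is why DESC on the copied table showed no key earlier), so after the rename you may want to put them back. A sketch:

mysql> alter table dups add primary key (id), modify id smallint unsigned not null auto_increment;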

Option 2:

Creating a unique index with the IGNORE clause on the columns (table_schema, table_name) eliminates the duplicates:

mysql> select max(id), count(*) from dups ;
+---------+----------+
| max(id) | count(*) |
+---------+----------+
|      89 |      139 | 
+---------+----------+
1 row in set (0.00 sec)

mysql> alter ignore table dups add unique index (table_schema, table_name);
Query OK, 139 rows affected (0.14 sec)
Records: 139  Duplicates: 50  Warnings: 0

mysql> select max(id), count(*) from dups ;
+---------+----------+
| max(id) | count(*) |
+---------+----------+
|      89 |       89 | 
+---------+----------+
1 row in set (0.00 sec)

mysql> show indexes from dups\G
*************************** 1. row ***************************
       Table: dups
  Non_unique: 0
    Key_name: table_schema
Seq_in_index: 1
 Column_name: table_schema
   Collation: A
 Cardinality: 6
    Sub_part: NULL
      Packed: NULL
        Null: YES
  Index_type: BTREE
     Comment: 
*************************** 2. row ***************************
       Table: dups
  Non_unique: 0
    Key_name: table_schema
Seq_in_index: 2
 Column_name: table_name
   Collation: A
 Cardinality: 89
    Sub_part: NULL
      Packed: NULL
        Null: YES
  Index_type: BTREE
     Comment: 
2 rows in set (0.00 sec)

mysql> alter table dups drop index table_schema ;
Query OK, 89 rows affected (0.12 sec)
Records: 89  Duplicates: 0  Warnings: 0

Option 3:

In this method we will use a DELETE statement; we also need a column that uniquely identifies each row, so we add one if it is not already present (here the existing id column is itself duplicated):

mysql> alter table dups add ( dummy_id smallint unsigned not null auto_increment primary key );
Query OK, 139 rows affected (0.11 sec)
Records: 139  Duplicates: 0  Warnings: 0

mysql> select min(dummy_id), max(dummy_id) from dups ;
+---------------+---------------+
| min(dummy_id) | max(dummy_id) |
+---------------+---------------+
|             1 |           139 | 
+---------------+---------------+
1 row in set (0.00 sec)

mysql> select max(id), count(*) from dups ;
+---------+----------+
| max(id) | count(*) |
+---------+----------+
|      89 |      139 | 
+---------+----------+
1 row in set (0.00 sec)


mysql> delete from dups using dups, dups as vtable where (dups.dummy_id > vtable.dummy_id)
    -> and (dups.table_schema = vtable.table_schema)
    -> and (dups.table_name = vtable.table_name);
Query OK, 50 rows affected (0.01 sec)

mysql> select max(id), count(*) from dups ;
+---------+----------+
| max(id) | count(*) |
+---------+----------+
|      89 |       89 | 
+---------+----------+
1 row in set (0.00 sec)

mysql> alter table dups drop column dummy_id ;
Query OK, 89 rows affected (0.15 sec)
Records: 89  Duplicates: 0  Warnings: 0

Posted in Uncategorized | Leave a Comment »

Query to get the undocumented parameters and their values

Posted by Arul Ramachandran on January 4, 2008

I am posting this to have a handy reference, as the x$ tables are greek to me 🙂 and I am not aware of any way to do this other than querying the x$ tables.

Here is a query to display all hidden (undocumented) parameters along with their session and instance values.

SELECT
a.ksppinm "Parameter",
b.ksppstvl "Session Value",
c.ksppstvl "Instance Value"
FROM
x$ksppi a,
x$ksppcv b,
x$ksppsv c
WHERE
a.indx = b.indx
AND
a.indx = c.indx
AND
a.ksppinm LIKE '/_%' escape '/'
/

When you run this query, expect to see hundreds of rows; the undocumented parameters are the ones whose names start with an “_”. Hence you’d probably want to filter the output by putting the parameter of interest in the LIKE clause of the query.
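For instance, to look at a single hidden parameter rather than the full list, tighten the predicate; the parameter name below is only an example:

SELECT
a.ksppinm "Parameter",
b.ksppstvl "Session Value",
c.ksppstvl "Instance Value"
FROM
x$ksppi a,
x$ksppcv b,
x$ksppsv c
WHERE
a.indx = b.indx
AND a.indx = c.indx
AND a.ksppinm = '_small_table_threshold'
/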

Posted in Tuning, Utilities | Leave a Comment »

Oracle 10g RAC Cluster interconnects

Posted by Arul Ramachandran on January 3, 2008

Yet another post on … say, you’ve just joined a new client, you are in a new RAC environment and you are trying to figure out the different components of the cluster.

This time it is on the Cluster private interconnects.

The private interconnects used in a RAC cluster are typically configured during the Clusterware install and registered in the OCR. This default configuration can later be overridden using the cluster_interconnects initialization parameter (though this is not recommended) for various reasons.
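For reference, the override is set per instance, and would then show up with SOURCE = cluster_interconnects parameter in gv$cluster_interconnects (see below); a sketch of what it looks like, with instance names and addresses that are purely illustrative:

alter system set cluster_interconnects = '192.168.10.1' scope=spfile sid='RAC1';
alter system set cluster_interconnects = '192.168.10.2' scope=spfile sid='RAC2';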

In order to troubleshoot interconnect-related issues, a good starting point is to find out which interconnects were originally configured and/or which interconnects are presently in use by the cluster. Here are a few ways to do this.

1. Using the dynamic view gv$cluster_interconnects:

select * from gv$cluster_interconnects ;

INST_ID    NAME            IP_ADDRESS       IS_ SOURCE
---------- --------------- ---------------- --- -------------------------------
1          eth0            192.168.10.1     NO  Oracle Cluster Repository
1          eth1            192.168.11.1     NO  Oracle Cluster Repository
2          eth0            192.168.10.2     NO  Oracle Cluster Repository
2          eth1            192.168.11.2     NO  Oracle Cluster Repository

In the above output, the SOURCE column indicates where the private interconnect information was derived from; its value can be the Oracle Cluster Repository (OCR), OS dependent software, or the cluster_interconnects parameter.

2. Using the clusterware command oifcfg:

$oifcfg getif
eth2 10.104.95.0 global public
eth0 192.168.10.0 global cluster_interconnect
eth1 192.168.11.0 global cluster_interconnect

3. Using oradebug ipc:

sqlplus "/ as sysdba"

SQL>oradebug setmypid
Statement processed.
SQL>oradebug ipc
Information written to trace file.

The above command would dump a trace to user_dump_dest. The last few lines of the trace would indicate the IP of the cluster interconnect. Below is a sample output of those lines.

From the trace file on node1:

SSKGXPT 0x5edf558 flags SSKGXPT_READPENDING socket no 9 IP 192.168.11.1 UDP 18852

From the trace file on node2:

SSKGXPT 0x5edf558 flags SSKGXPT_READPENDING socket no 9 IP 192.168.10.2 UDP 38967

Posted in RAC | Tagged: | 3 Comments »

Oracle RAC Clusterware: Voting disks

Posted by Arul Ramachandran on December 28, 2007

Say, you’ve just joined a new client, you are in a new RAC environment and you are trying to figure out the different components of the cluster.

The voting disk is one such component in 10g RAC; it is used to store the heartbeat information between nodes. Like the OCR, the voting disk is a shared disk component that is accessed by the nodes during cluster operation. If a node is unable to ping the voting disk, the cluster immediately detects a communication failure and the node is evicted from the cluster to keep the cluster healthy.

The crsctl command can be used to check the voting disk information of a cluster. Here is how to use crsctl to list the voting disks.

$ crsctl query css votedisk
0. 0 /dev/vx/rdsk/ocrdg/vdvol00
1. 0 /dev/vx/rdsk/ocrdg/vdvol01
2. 0 /dev/vx/rdsk/ocrdg/vdvol02

Posted in RAC | 1 Comment »

Oracle RAC Clusterware: OCR Disks

Posted by Arul Ramachandran on December 28, 2007

Say, you’ve just joined a new client, you are in a new RAC environment and you are trying to figure out the different components of the cluster.

Oracle Cluster Registry (OCR) is one such component in 10g RAC; it is used to store the cluster configuration information. It is a shared disk component, typically located on a shared raw volume that must be accessible to all nodes in the cluster. The CRSd daemon manages the configuration information in the OCR and records changes to the cluster in the registry.

ocrcheck is the command to check out the OCR. Here is a sample output of ocrcheck.

$ocrcheck
Status of Oracle Cluster Registry is as follows :
Version : 2
Total space (kbytes) : 1048296
Used space (kbytes) : 5116
Available space (kbytes) : 1043180
ID : 834229908
Device/File Name : /dev/vx/rdsk/ocrdg/ocrvol
Device/File integrity check succeeded
Device/File Name : /dev/vx/rdsk/ocrdg/ocrvol2
Device/File integrity check succeeded

Cluster registry integrity check succeeded

The file /etc/oracle/ocr.loc can also be used to find the OCR locations.

$cat /etc/oracle/ocr.loc
ocrconfig_loc=/dev/vx/rdsk/ocrdg/ocrvol
ocrmirrorconfig_loc=/dev/vx/rdsk/ocrdg/ocrvol2
local_only=FALSE

Posted in RAC | 6 Comments »

Oracle Undo size: 10g 10.2.0.3 vs 9i 9.2.0.6

Posted by Arul Ramachandran on December 19, 2007

Does 10g R2 generate more undo than 9i R2?

While trying to inspect the undo details for an UPDATE statement using v$transaction in my previous post (this was tested in Oracle 10.2.0.3), I found something strange. While the UPDATE statement updated only 100,000 rows, the USED_UREC column of v$transaction reported 200,000 undo records. I was curious to see the behavior in 9i.

Here is what I saw from v$transaction in 9i:

SQL_TEXT                                    RECORDS     BLOCKS     KBYTES
---------------------------------------- ---------- ---------- ----------
update t1 set name = rpad('y', 30, 'y' )     100000       1516      12128

The RECORDS field is an alias for the USED_UREC column, and it shows only 100,000 records. The BLOCKS field, an alias for the USED_UBLK column, shows 1516 blocks, which is about half of the 3078 blocks I saw in 10g (see previous post). Does this mean 10g R2 generates twice as much undo?

As a next step, I did a tkprof for both tests.

On 9.2.0.6, tkprof shows the following:

update t1 set name = rpad('y', 30, 'y' )       

call     count       cpu    elapsed    disk      query    current        rows
------- ------  -------- ---------- ------- ---------- ----------  ----------
Parse        1      0.00       0.00       0          0          0           0
Execute      1      1.23       4.84       0        675     103151      100000
Fetch        0      0.00       0.00       0          0          0           0
------- ------  -------- ---------- ------- ---------- ----------  ----------
total        2      1.23       4.84       0        675     103151      100000

There is nothing unusual here.

On 10.2.0.3, tkprof shows the following:

update t1 set name = rpad('a', 30, 'a' )                 

call     count       cpu    elapsed    disk      query    current        rows
------- ------  -------- ---------- ------- ---------- ----------  ----------
Parse        1      0.00       0.00       0          0          0           0
Execute      1      2.07       2.72       0        724     205112      100000
Fetch        0      0.00       0.00       0          0          0           0
------- ------  -------- ---------- ------- ---------- ----------  ----------
total        2      2.07       2.72       0        724     205112      100000

Here the value of current mode gets is twice that of 9.2.0.6. Why so?

For both cases 9i and 10g, table T1 did not have any indexes.

Digging further to check whether the recursive SQLs had any effect, I noticed another difference in the tkprof output and the trace:

On 9.2.0.6

select file#
from
 file$ where ts#=:1                 

call     count       cpu    elapsed    disk      query    current        rows
------- ------  -------- ---------- ------- ---------- ----------  ----------
Parse       12      0.00       0.00       0          0          0           0
Execute     12      0.00       0.00       0          0          0           0
Fetch       36      0.00       0.00       0         48          0          24
------- ------  -------- ---------- ------- ---------- ----------  ----------
total       60      0.00       0.00       0         48          0          24       

Misses in library cache during parse: 0
Optimizer goal: CHOOSE
Parsing user id: SYS   (recursive depth: 1)                 

Rows     Row Source Operation
-------  ---------------------------------------------------
      2  TABLE ACCESS BY INDEX ROWID FILE$
      2   INDEX RANGE SCAN I_FILE2 (object id 42)

On 10.2.0.3

select file#
from
 file$ where ts#=:1                 

call     count       cpu    elapsed    disk      query    current        rows
------- ------  -------- ---------- ------- ---------- ----------  ----------
Parse       23      0.00       0.00       0          0          0           0
Execute     23      0.00       0.00       0          0          0           0
Fetch       46      0.00       0.00       0         92          0          23
------- ------  -------- ---------- ------- ---------- ----------  ----------
total       92      0.00       0.00       0         92          0          23       

Misses in library cache during parse: 0
Optimizer mode: CHOOSE
Parsing user id: SYS   (recursive depth: 1)                 

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  TABLE ACCESS FULL FILE$ (cr=4 pr=0 pw=0 time=44 us)

Note that the number of executions of this SQL has doubled in 10g, and that it now does a full table scan on FILE$.

Does the change in the access path of this recursive SQL contribute to the increase in current mode gets? And does that impact the undo size? At this point, I am really puzzled 😦

Posted in 10G, Tuning, UNDO | 3 Comments »

How much UNDO does a SQL use?

Posted by Arul Ramachandran on December 8, 2007

If you wish to know how much UNDO a DML statement uses, here is a method.

First, let’s create a test table and populate it with 100,000 rows.

SQL> create table t1 ( id number(6), name varchar2(30) );

Table created.

SQL> create sequence seq;

Sequence created.

SQL> insert into t1 select seq.nextval, rpad('x', 30, 'x') from test where rownum < 100001;

100000 rows created.

SQL> commit;

Commit complete.

SQL> select count(*) from t1;

  COUNT(*)
----------
    100000

The dynamic view v$transaction provides details of the active transactions in the database; the column USED_UBLK gives the number of undo blocks consumed by a transaction, and the column USED_UREC gives the number of undo records used.

col sql_text format a40
set lines 130
select sq.sql_text sql_text, t.USED_UREC Records, t.USED_UBLK Blocks, (t.USED_UBLK*8192/1024) KBytes from v$transaction t,
v$session s,
v$sql sq
where t.addr = s.taddr
and s.sql_id = sq.sql_id
and s.username = '<user>'
/

Let’s update the table using the following statement:

SQL> update t1 set name = rpad('y', 30, 'y') ;

100000 rows updated.

and run the v$ script from another window to get the output below:

SQL_TEXT                                    RECORDS     BLOCKS     KBYTES
---------------------------------------- ---------- ---------- ----------
update t1 set name = rpad('y', 30, 'y')      200000       3078      24624

Using v$transaction in this way provides insight into the active transactions on the system and the resources they use.

This tells us the UPDATE used 3078 undo blocks, or about 24MB of undo. But look at the USED_UREC column: it shows 200,000 records when I updated only 100,000 rows… this seems strange, and it is not what I’ve seen before. I need to check more on this… stay tuned…

Posted in UNDO, Utilities | 3 Comments »

Weird RAC issue due to an incorrect setup

Posted by Arul Ramachandran on November 18, 2007

When setting up a 4-node Oracle 10.2.0.2 RAC on RHEL4, after installing Oracle Clusterware and the Oracle RAC binaries and creating a clustered database, I went ahead to open the instance on the first node and then the second node. The instance on the first node would come up without any problem, but the instance on the second node would take a while to come up. On further checking we realized that the first instance was down: it turned out that while the second instance was coming up, the first instance would crash. It all seemed weird, because one would think both instances were healthy, but just the act of bringing up an instance would crash the already open instance.

The alert log had the following messages:

Interface type 1 eth0 192.168.10.0 configured from OCR for use as a cluster interconnect
Interface type 1 eth3 192.168.11.0 configured from OCR for use as a cluster interconnect
WARNING 192.168.11.0 could not be translated to a network address error 1
Interface type 1 eth2 1#.1##.145.0 configured from OCR for use as a public interface

I went ahead to check whether both interconnects were healthy.

A ping command on the primary interconnect revealed it was fine.

$ping <Primary-NIC>
PING Primary-NIC (192.168.10.2) 56(84) bytes of data.
64 bytes from Primary-NIC (192.168.10.2): icmp_seq=0 ttl=64 time=0.035 ms
64 bytes from Primary-NIC (192.168.10.2): icmp_seq=1 ttl=64 time=0.045 ms

--- Primary-NIC ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 999ms
rtt min/avg/max/mdev = 0.035/0.040/0.045/0.005 ms, pipe 2

A ping command on the secondary interconnect indicated there was a problem.

$ping <Secondary-NIC>
PING Secondary-NIC (192.168.11.2) 56(84) bytes of data.

--- Secondary-NIC ping statistics ---
28 packets transmitted, 0 received, 100% packet loss, time 27018ms

Once the SysAdmin fixed the secondary interconnect, both instances came up without any issues. This reinforces that if you specify both a primary and a secondary interconnect during the install, both must be functioning; configuring two interconnects this way does not provide redundancy or the ability to fail over to the good interconnect if one of them fails.

The setup did not use NIC bonding; instead it used a primary and a secondary interconnect for the private interface (the secondary interconnect intended for failover capability). I am positive both NIC cards were working during the install, as the install pre-requisites were all met, but it is evident from the alert log messages that the second interconnect later ran into trouble.

Multiple interconnects allow Cache Fusion traffic to be distributed across all of them. However, if any one of the interconnects stops functioning, Oracle will assume the private network is down and will not open in cluster mode. Therefore it is highly recommended to use NIC bonding at the OS level for NIC failover and traffic sharing capability, as sketched below.
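For completeness, here is a minimal sketch of what active-backup bonding looks like on RHEL 4; the interface names, address, and options are illustrative, and the link in the P.S. below covers the real details:

# /etc/modprobe.conf
alias bond0 bonding
options bond0 mode=active-backup miimon=100

# /etc/sysconfig/network-scripts/ifcfg-bond0
DEVICE=bond0
IPADDR=192.168.10.2
NETMASK=255.255.255.0
ONBOOT=yes
BOOTPROTO=none

# /etc/sysconfig/network-scripts/ifcfg-eth0  (repeat for the second private NIC)
DEVICE=eth0
MASTER=bond0
SLAVE=yes
ONBOOT=yes
BOOTPROTO=none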

Bottom line: this incident clearly indicates the configuration of the interconnect plays a crucial role in RAC. Should I even make this statement? 🙂

P.S: for the dirty details of setting up NIC bonding in Linux check out this excellent link by Vivek.

Posted in RAC | Tagged: , | 1 Comment »

RAC Hangs due to small cache size on SYS.AUDSES$

Posted by Arul Ramachandran on November 15, 2007

If you run Oracle RAC on 10.2.0.2 or 9i watch out for the following.

I’ve seen on certain occasions, when there is heavy system activity, all instances in the RAC cluster hang; even connecting as sysdba was terribly slow. It turned out this was because the sequence SYS.AUDSES$ had its cache setting at the default value of 20 (you can check the current setting with the query shown after the fix below). Metalink note 395314.1 mentions that some of the symptoms are:

– Checkpoint not completing on all RAC nodes
– Waits expire on row cache enqueue lock dc_sequences
– RAC hangs due to QMON deadlocking

It also mentions that the fix is to increase the cache size to a large value:

alter sequence sys.audses$ cache 10000;
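To see what the sequence’s cache is currently set to, a simple check against the standard DBA_SEQUENCES view (a sketch):

select sequence_owner, sequence_name, cache_size
from dba_sequences
where sequence_owner = 'SYS'
and sequence_name = 'AUDSES$';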

Apparently this is fixed in 10.2.0.3.

Posted in RAC | Tagged: | Leave a Comment »