Oracle In-Memory Base-Level (Part 2: How to enable the base level and find candidates for In-Memory Base-Level)

The easiest part is enabling the In-Memory Base Level. There are only two parameters one needs to set in the spfile/pfile.

The first one is INMEMORY_FORCE, which has to be set to "BASE_LEVEL". The default value is "DEFAULT", which means the $$$ In-Memory option of the Enterprise Edition database. The Base Level is cost free, but it is also only included with the Enterprise Edition license.
The good thing: as soon as the database is started with the parameter set to BASE_LEVEL, all features that are not allowed at this level are DISABLED! That means you can't use them, so you don't run into any license issues by accidentally using things you haven't licensed (there are enough examples in the database where something can be used "technically" but is "forbidden" by license). This is why I would suggest setting INMEMORY_FORCE to BASE_LEVEL on all EE databases - even if you don't want to use the feature, it will prevent anybody from using the $$$ option!

The second parameter to set is INMEMORY_SIZE, the size of the In-Memory Column Store. The minimum is 100 MB, the maximum at the Base Level is 16 GB. If you try to set it higher, it will not work until you change INMEMORY_FORCE back to "DEFAULT".

Remark: After setting these parameters you need to restart the database! (By the way, increasing INMEMORY_SIZE later can be done with alter system ... scope=memory/both; shrinking it only works with scope=spfile and another restart.)
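
Just as a minimal sketch of what that looks like in practice (the 2G size is only an example value for my small test system, not a recommendation):

SQL> -- enable the cost-free Base Level (disables all $$$-option features)
SQL> alter system set inmemory_force=BASE_LEVEL scope=spfile;

SQL> -- size the In-Memory Column Store (100 MB minimum, 16 GB maximum at the Base Level)
SQL> alter system set inmemory_size=2G scope=spfile;

SQL> -- both parameters only take effect after a restart
SQL> shutdown immediate
SQL> startup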

To understand which objects can be candidates, we first must understand for which scenarios the In-Memory Base Level can be helpful.

One of the main points is that In-Memory is absolutely transparent for applications. If data is missing from the In-Memory Column Store, the traditional Buffer Cache is used to get it. As the In-Memory Cache (just another name for the In-Memory Column Store) is compressed, MORE data may fit into the In-Memory Cache than into the Buffer Cache (I will highlight that in another post).
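
Once objects have been populated into the In-Memory Cache (populating is a topic for a later post), a small sketch like the following can be used to compare the size on disk with the compressed size in the column store - V$IM_SEGMENTS only shows objects that are populated:

SQL> -- BYTES = segment size on disk, INMEMORY_SIZE = compressed size in the IM column store
SQL> select owner, segment_name, bytes, inmemory_size, inmemory_compression
     from v$im_segments
     order by bytes desc;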

Where does the In-Memory Cache help to get better performance? To answer this question, we consult the documentation (a small example query follows the list):

  • Faster queries, scans, joins and aggregates, especially with filters using operators like <, >, IN, =
  • Selecting only a few columns from a table/materialized view with a large number of columns
  • Querying LOB columns with SQL operators
  • SQLs where bloom filters can be used to eliminate data sets that are not scanned, e.g. small dimension tables filtering large tables, typically primary key / foreign key relationships
  • Queries with aggregations, like GROUP BY, VECTOR GROUP BY
  • Queries with CONTAINS() on XML or text columns or JSON_TEXTCONTAINS() for JSON documents
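
To make this list a bit more tangible, here is a purely illustrative query against the HR test schema from Part 1 that hits several of these criteria at once - only a few columns of a wide table, a range filter and a GROUP BY aggregation:

SQL> -- few columns of a wide table, a ">" filter and an aggregation
SQL> select department_id, count(*) as emp_count, avg(salary) as avg_salary
     from employees
     where salary > 10000
     group by department_id;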

As a first step, ALL objects that fit into one of the above categories are possible candidates for the In-Memory Cache. That means nearly all tables, table partitions, materialized views or expressions could be candidates (indexes, for example, can't). For a running application this isn't really helpful yet.

A bit luckier are people who have licensed the Diagnostics and Tuning Packs for the database. If so, there is a little tool available for download at support note 1965343.1.

The Database In-Memory Advisor was built for the "normal" In-Memory option. It installs some program code and analyzes the activity of the database; one can specify the maximum In-Memory Cache size as input. The activity is analyzed e.g. using the Active Session History (this is why the Diagnostics and Tuning Packs are needed).

The HTML report one gets shows the objects that are candidates for In-Memory, the compression type and the expected performance improvement. If you are interested in the In-Memory Advisor, please have a look at this document. There is also a really good YouTube video that can assist you; starting at roundabout 27:14 you can see the HTML output of the advisor. One thing to keep in mind: the In-Memory Advisor uses ALL compression modes (also in the population scripts it generates), but at the Base Level only "memcompress for query low" is available. A good workaround is to run the In-Memory Advisor with a cache size a lot smaller than what is really available. At least one gets a good impression of possible candidates.

And if one does not have the Diagnostics and Tuning Packs licensed? Do you like digging into the data dictionary and performance views? Yes, there are 3rd-party tools available for this, but no, I wouldn't recommend using them (at least not the ones I have seen so far). This means the most helpful thing is to look at the well-known performance views like v$sql_plan, v$sql, etc. Check these views for execution statistics and execution plans and try to find the "worst" statements regarding the following criteria (a sample query follows the list):

  • Find the statements where the application spends the most time and find out which objects they use
  • Take a look at recurring joins in the statements (as they can be candidates for expressions)
  • Check statements with a lot of I/O (full table scans or a high number of index blocks) as well as high Buffer Cache reads
  • Separate, very frequently used table partitions
  • Frequently used materialized views
  • If a table object is already populated In-Memory, one can use the "In-Memory Expression Capture" to find the hottest expressions
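
A simple sketch to get a first list of such statements from V$SQL - the ordering by elapsed time and the limit of 10 rows are just my personal starting point, adjust them to your needs:

SQL> -- top 10 statements by total elapsed time (elapsed_time is given in microseconds)
SQL> select *
     from (select sql_id, executions, round(elapsed_time/1e6) as elapsed_sec,
                  buffer_gets, disk_reads, substr(sql_text,1,60) as sql_text_part
           from   v$sql
           order by elapsed_time desc)
     where rownum <= 10;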

An example: I have created a (OK, logically really senseless) select statement in the test environment I set up (see Part 1 of this post series). It runs for more than 27 minutes, and it was easy to find that statement in the v$ views.

We now return to the criteria above ("where is In-Memory helpful" and "finding the worst statements") and look at the execution plan.


Marked yellow we see full table accesses on employees (and this is a huuuuuuuge table with a lot of columns) and a high number of rows selected from locations. Marked blue we see something that looks good at first sight for jobs and departments - an index UNIQUE scan with an additional table access. Looking at the select, one can see that we perfectly hit the criterion "filters like IN or <, >, ...".

There are also joins (lines 5-7) for the different result sets of the subselects. That means there could be candidates for an "In-Memory expression".

If you have this kind of execution plan, check the predicate information for each line marked with a star in the execution plan. In the next picture (showing the predicate information) it is a lot easier to understand what the optimizer is doing in the background, using the "filter" and "access" criteria. Combined with the objects in the lines of the execution plan we find that employees is a good candidate (due to lines 8 and 4), locations is a good candidate (due to the number of rows and the filter criteria found for line 13), the filter of line 15 shows departments as a good candidate, and so on.
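
If you want to reproduce such a picture yourself, the plan including the Predicate Information section can be pulled from the cursor cache with DBMS_XPLAN - the sql_id below is of course just a placeholder for the statement you found in v$sql:

SQL> -- &sql_id: the SQL_ID taken from v$sql; the TYPICAL format includes the predicate section
SQL> select * from table(dbms_xplan.display_cursor('&sql_id', null, 'TYPICAL'));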

Stay tuned for the next posts, where I will show how to populate the IM cache and how the execution plan and the predicate information help to check whether populating the objects into the IM cache was successful. In addition, I will show why objects in the IM cache may be bigger than they are at the datafile level and which views are useful for keeping an eye on the IM cache.

DOAG Conference 2020 - Presentation is online

I have already uploaded the PDF of my presentation "Datenbankupgrades mit Autoupgrade über 2 Nodes" to my Slide Download Center (you can find it on the right side or via this link). The presentation is in German, but if you have any questions, write a comment or contact me.


Autoupgrade.jar throws error at analyze phase - UPG-1316

While testing the newest versions of autoupgrade.jar, I ran into an error (on my MS Windows environment; I haven't tested this on Linux, etc. yet).

The error itself is shown as UPG-1316, unexpected exception error:



I have seen two different causes. One is the disk space check when no Fast Recovery Area is used; then you get this "Min_Archive_Dest_Size" error:

 
The same problem arises when autoupgrade checks the FRA size - "Disk_Space_For_Recovery_Area".

Affected versions are 19.9.2 and 19.10.0; the issue does not occur with 19.7.3, 19.7.5 and 19.8.1 on the same database / server.

After investigating the issue with Development, it was fixed within 4 days in the newest development version, which I have tested successfully. I don't know when this version will be released and under which version number. If you hit the problem, stay on 19.8.1 or go for a version newer than 19.10.0 as soon as it is released.
 
@Dev: Thank you for the quick fix!

Oracle In-Memory Base-Level (Part 1: How to setup a test environment with example data)

As everyone is now (hopefully) upgrading their Oracle databases to 19c as the long-term release, most of you gain new functionality. I don't want to mention everything that is new between 11.2.0.4 or twelve-something and 19c, but there is something that was added to 19c, not quietly, but really surprisingly.

The In-Memory Base Level was first announced with the Oracle 20c database beta, which was made available in the Oracle Cloud at the beginning of this year. Later on Oracle announced that this new feature is available with all Enterprise Edition licenses, starting with version 19.8 (there are some exceptions, e.g. the feature is not allowed in 3rd-party clouds, so there is no availability at AWS or Azure, and some operating systems will follow later).

The In-Memory feature is nothing new for Enterprise Edition customers who have bought the In-Memory option. It has been available since 12.1 and has been enhanced with a lot of new features since then; now it has been released as the Base Level with some restrictions (like the 16 GB maximum size of the In-Memory Column Store - more about the features and restrictions will follow in the second part of this In-Memory Base Level series).

Before we can start to test the In-Memory Base Level, we need to set up an environment. As a consultant, I can't just take a dump of an existing customer database: on the one hand it would contain highly sensitive data, on the other hand it would burst the SSD of my laptop. Therefore, I need to set up something I can test with, built especially for my test case scenarios.

As you may want to do that yourself, I will show you how to set up a simple environment to test the In-Memory Base Level using Oracle's sample HR schema.

The how-to: 

A) Setup the Example Schemas

1.) Set up a 19.8 database on Linux or Windows with your preferred architecture and configure everything so that one can connect to the database (or pluggable database) with sqlplus.

2.) Download the sample schema creation scripts from GitHub.

3.) Copy the scripts to a working directory on your test machine, from which you can run them, and extract them.

4.) Unfortunately, the scripts contain "__SUB__CWD__" placeholder folder names inside the .sql and .dat files. One needs to change these to the working directory. On Linux this is possible e.g. with the following perl command:

perl -p -i.bak -e 's#__SUB__CWD__#'$(pwd)'#g' *.sql */*.sql */*.dat

If you are on Windows, you can either try to use the perl.exe from your %ORACLE_HOME%/perl/bin and replace $(pwd) in the command with your working directory (but don't kill me if it does not work, I haven't tested that yet), or you can use e.g. Notepad++ (my preferred tool). With Notepad++ you can use "Find in Files", specify your working directory and what you want to replace inside the files of the working directory and its subdirectories. Once the scripts are ready, you can proceed with

5.) the creation of a tablespace for the sample schemas, e.g. EXAMPLE. If you have set db_create_file_dest, it is a simple

SQL> create tablespace example;

If you don't know whether this parameter is set, check it first:

SQL> show parameter db_create_file_dest

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
db_create_file_dest                  string      +DATA

6.) I prefer to set this new tablespace as the default, because I do not only run the script but a lot more, and otherwise things might accidentally be stored in the SYSTEM tablespace or somewhere else.

SQL> alter database default tablespace example;
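
Just as a quick check (a small sketch), the new default can be verified afterwards like this:

SQL> select property_value
     from database_properties
     where property_name = 'DEFAULT_PERMANENT_TABLESPACE';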

Remark: The script used to create the sample schemas can also create a tablespace with datafiles, but with PDBs this may run into an error, and personally I prefer to create it beforehand and pass only the tablespace name to the script.

7.) Now one can create the sample schemas by running the mksample script (please read the readme.txt about the parameters). In my environment (I set all passwords to "oracle") I run the following:

SQL> @/home/oracle/Downloads/db-sample-schemas-19c/mksample oracle oracle oracle oracle oracle oracle oracle oracle example temp $ORACLE_HOME/demo/schema/log pdb1

After the script has run successfully, the example schemas are installed, but the number of rows in them does not allow testing any In-Memory functionality, as everything is far too small.
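
To get a feeling for how small the delivered data really is, a quick sketch like this shows the row counts of the HR tables (connected as HR; NUM_ROWS comes from the optimizer statistics, so it is only an approximation):

SQL> select table_name, num_rows
     from user_tables
     order by table_name;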

B) Add data into example schemas

Now we can start adding data to the example schemas, either manually or by script. I used the HR schema for my tests, so I begin by adding additional rows to the look-up tables JOBS, LOCATIONS and DEPARTMENTS.

1.) Let's do it with jobs:

SQL> insert into jobs values ('IM_TEST','In-Memory-Tester',14000,240000);
        1 row created.
SQL> insert into jobs values ('GG','Good Guys',14,2400);
        1 row created.
SQL> commit;
        Commit complete.
 
Now we need some more locations (in my sample there are only 23 rows in the table).

2.) Locations:
First we widen LOCATION_ID and adjust the corresponding sequence so we can fill the table up quite a bit more.

SQL> alter sequence locations_seq maxvalue 1000000000 increment by 1;
SQL> alter table locations modify location_id number(9);

As LOCATIONS does not have an index on STREET_ADDRESS, we reuse the STREET_ADDRESS values of the already existing rows and add random data to the other fields. Logically this does not make any sense, but for testing purposes it's quite OK.

declare
  i number;
begin
  for i in 1..200000 loop
    begin
      insert into locations (location_id, street_address, postal_code, city, state_province, country_id)
      values (locations_seq.nextval,
              (select street_address from (select * from locations order by dbms_random.random) where rownum = 1),
              trunc(dbms_random.value(1000,999999)),
              (select city from (select * from locations order by dbms_random.random) where rownum = 1),
              '',
              (select country_id from (select * from countries order by dbms_random.random) where rownum = 1));
    exception when dup_val_on_index then null;
    end;
  end loop;
end;
/
commit;

One ends up with 200'023 rows, but STREET_ADDRESS has only 23 distinct values and they are not evenly distributed.
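
That skew can be verified with a quick sketch like this:

SQL> -- shows how the 23 distinct street addresses are distributed over the 200'023 rows
SQL> select street_address, count(*)
     from locations
     group by street_address
     order by count(*) desc;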
 

 
 
3.) Departments
For all these locations, we now need some departments:

SQL> alter sequence departments_seq maxvalue 1000000000 increment by 1;
SQL> alter table departments modify department_id number(9);
SQL> alter table departments modify location_id number(9);
SQL> alter table departments modify department_name varchar(100);

As I want to test the In-Memory functionality with "like XXX%" filters, I just add more departments with an insert into ... select and append the sequence value to the DEPARTMENT_NAME.

You can run something like this to also get different location_ids and manager_ids per department:

declare
  i number;
begin
  for i in 1..500 loop
    insert into departments
       select departments_seq.nextval,
              substr(department_name,1,20)||'-'||(to_char(trunc(departments_seq.currval))),
              (select manager_id from (select * from departments order by dbms_random.random) where rownum=1),
              (select location_id from (select * from locations order by dbms_random.random) where rownum=1)
       from departments
       where rownum < 30;
  end loop;
end;
/
commit;

After running this a few times (and also changing some values), I now have 560'000 different departments in the table, which look like this:

 

4.) And last, we add some more employees. To test the compression of the In-Memory store, I created roundabout 750'000 rows of senseless data (I just use the dbms_random package to create character strings). One can run this several times to blow up the database even more than I did (I was limited by the resources of my notebook):

SQL> alter table employees modify department_id number(9);
SQL> alter table employees modify email varchar2(50);

declare
  cursor c1 is select department_id from departments where rownum < 20000;
begin
  for c1_rec in c1 loop
    begin
      -- 'IM_TEST' is the job created above (the job_id has to exist in JOBS)
      insert into employees (employee_id, first_name, last_name, email, hire_date, job_id, salary, department_id)
      values (employees_seq.nextval,
              dbms_random.string('a',TRUNC(DBMS_RANDOM.value(5,20))),
              dbms_random.string('a',TRUNC(DBMS_RANDOM.value(5,20))),
              dbms_random.string('a',TRUNC(DBMS_RANDOM.value(5,20)))||'@mycompany.com',
              TRUNC(SYSDATE + DBMS_RANDOM.value(0,366)),
              'IM_TEST',
              TRUNC(DBMS_RANDOM.value(1000,100000)),
              c1_rec.department_id);
    exception when dup_val_on_index then null;
    end;
  end loop;
end;
/
commit;
 
As I said, the data itself is really senseless; here is a sample:
 

 
One of the test scenarios mentioned in the In-Memory documentation is a performance improvement when "selecting only a few columns of a table with a high number of columns".
To allow some tests with that, I enhance employees with additional columns and fill them with the data from the original columns (but without creating indexes or constraints) by an update.
 
SQL> alter table employees add (employee_id_1 number(6), first_name_1 varchar2(20),
 last_name_1 varchar2(25), email_1 varchar2(50), phone_number_1 varchar2(20),
 hire_date_1 date, job_id_1 varchar2(10), salary_1 number(8,2), commission_pct_1 number(2,2),
 manager_id_1 number(6), department_id_1 number(9),
 employee_id_2 number(6), first_name_2 varchar2(20), last_name_2 varchar2(25),
 email_2 varchar2(50), phone_number_2 varchar2(20), hire_date_2 date,
 job_id_2 varchar2(10), salary_2 number(8,2), commission_pct_2 number(2,2),
 manager_id_2 number(6), department_id_2 number(9),
 employee_id_3 number(6), first_name_3 varchar2(20), last_name_3 varchar2(25),
 email_3 varchar2(50), phone_number_3 varchar2(20), hire_date_3 date,
 job_id_3 varchar2(10), salary_3 number(8,2), commission_pct_3 number(2,2),
 manager_id_3 number(6), department_id_3 number(9),
 employee_id_4 number(6), first_name_4 varchar2(20), last_name_4 varchar2(25),
 email_4 varchar2(50), phone_number_4 varchar2(20), hire_date_4 date,
 job_id_4 varchar2(10), salary_4 number(8,2), commission_pct_4 number(2,2),
 manager_id_4 number(6), department_id_4 number(9));

SQL> update employees set
        employee_id_1=employee_id, employee_id_2=employee_id, employee_id_3=employee_id, employee_id_4=employee_id,
        first_name_1=first_name, first_name_2=first_name, first_name_3=first_name, first_name_4=first_name,
        last_name_1=last_name, last_name_2=last_name, last_name_3=last_name, last_name_4=last_name,
        email_1=email, email_2=email, email_3=email, email_4=email,
        hire_date_1=hire_date, hire_date_2=hire_date, hire_date_3=hire_date, hire_date_4=hire_date,
        job_id_1=job_id, job_id_2=job_id, job_id_3=job_id, job_id_4=job_id,
        salary_1=salary, salary_2=salary, salary_3=salary, salary_4=salary,
        department_id_1=department_id, department_id_2=department_id, department_id_3=department_id, department_id_4=department_id;

SQL> commit;
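
As a quick sanity check (just a sketch), one can confirm that EMPLOYEES has become a really wide table now:

SQL> -- number of columns of the enhanced EMPLOYEES table
SQL> select count(*) as column_count
     from user_tab_columns
     where table_name = 'EMPLOYEES';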

 
That's it. With this data in place we can enable the In-Memory Base Level and test some SQLs. I will write a second blog post about that soon.