Monday, March 11, 2013

[how to] Convert the IP Address range to two BIGINT for faster lookup

Convert the IP Address range to two BIGINT for faster lookup

Posted: 11 Mar 2013 10:01 PM PDT

I am working on a project in which we need to do IP address lookups. For this we have a dataset, and the next step is to populate it into a MySQL table. Below is an example of the dataset, containing these columns:

ip_address       country  region  city           metro-code  latitude  longitude  postal-code  country-code  region-code  city-code  continent-code  country-code
24.32.116.116    usa      tx      clarksville    623         33.6103   -95.0498   75426        840           44           34918      6               us
65.29.108.232    usa      mi      novi           505         42.4637   -83.4604   48375        840           23           4339       6               us
66.169.99.69     usa      tx      ft worth       623         32.7016   -97.3823   76109        840           44           771        6               us
72.70.100.111    usa      ma      peabody        506         42.5332   -70.9726   1960         840           22           1392       6               us
174.252.116.137  usa      va      oakton         511         38.8892   -77.3222   22124        840           47           3860       6               us
68.55.144.242    usa      md      pasadena       512         39.1276   -76.5125   21122        840           21           4358       6               us
174.252.83.27    usa      pa      lancaster      566         40.0459   -76.3542   17603        840           39           333        6               us
174.78.192.31    usa      ga      warner robins  503         32.5958   -83.6384   31088        840           11           5052       6               us
98.192.146.235   usa      fl      panama city    656         30.1804   -85.5598   32404        840           10           671        6               us
71.192.181.20    usa      ma      springfield    543         42.1187   -72.5483   1109         840           22           967        6               us
76.183.54.227    usa      tx      dallas         623         32.7202   -96.6769   75217        840           44           77         6               us
69.62.143.140    usa      ca      granite bay    862         38.7442   -121.191   95746        840           5            49451      6               us
69.142.76.32     usa      nj      penns grove    504         39.707    -75.4467   8069         840           31           2335       6               us
70.173.54.93     usa      nv      las vegas      839         36.2059   -115.225   89108        840           29           173        6               us
98.242.8.222     usa      ca      fresno         866         36.7968   -119.883   93722        840           5            19         6               us

Problem Statement:-

I am planning to store START_IP_NUM and END_IP_NUM as BIGINT in the MySQL database instead of storing the IP address as a string, since a lookup against a BIGINT is far faster than searching a string (dotted IP notation).

My question: we will have a separate Java program that populates the above dataset into the MySQL table, so I need to devise some logic that converts the dataset into something like the following:

start_ip_num  end_ip_num  country   region    city             metro-code    latitude      longitude     postal-code     country-code       region-code   city-code     continent-code    country-code  

I am confused: given an IP address, how should I derive start_ip_num and end_ip_num and store them in the MySQL table?
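For the dotted-quad-to-integer part, here is a minimal sketch using MySQL's built-in functions; the ip_lookup table and its range columns are assumptions for illustration, not part of the original dataset:

    -- INET_ATON() turns dotted notation into an unsigned 32-bit integer (fits in BIGINT);
    -- INET_NTOA() converts back
    SELECT INET_ATON('24.32.116.116');   -- 404780148
    SELECT INET_NTOA(404780148);         -- '24.32.116.116'

    -- range lookup against numeric bounds (table/column names are assumptions)
    SELECT country, region, city
    FROM ip_lookup
    WHERE INET_ATON('24.32.116.116') BETWEEN start_ip_num AND end_ip_num;

The same arithmetic (a*2^24 + b*2^16 + c*2^8 + d) can be done in the Java loader before the INSERT.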

Why is this user "Admin" created in all databases?

Posted: 11 Mar 2013 09:40 PM PDT

When I create a database in SQL Server Management Studio, a user named "admin" is also created in every database:


Why is this user created, and how can I change this behavior?
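One likely cause (an assumption, not confirmed by the post) is that the user exists in the model database, which SQL Server copies whenever a new database is created. A quick check:

    -- if 'admin' shows up here, it will be copied into every database created from model
    USE model;
    SELECT name, type_desc, create_date
    FROM sys.database_principals
    WHERE name = 'admin';

    -- removing it from model stops it appearing in databases created afterwards
    -- DROP USER admin;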

IP Address lookup star schema design

Posted: 11 Mar 2013 09:04 PM PDT

I am working on a project in which we need to do IP address lookups. We have the dataset, and the next step is to populate it into a table. Then we will have a service that accepts an IP address as input and looks it up in the database. Below is what the dataset looks like:

ip_address       country  region  city           metro-code  latitude  longitude  postal-code  country-code  region-code  city-code  continent-code  country-code
24.32.116.116    usa      tx      clarksville    623         33.6103   -95.0498   75426        840           44           34918      6               us
65.29.108.232    usa      mi      novi           505         42.4637   -83.4604   48375        840           23           4339       6               us
66.169.99.69     usa      tx      ft worth       623         32.7016   -97.3823   76109        840           44           771        6               us
72.70.100.111    usa      ma      peabody        506         42.5332   -70.9726   1960         840           22           1392       6               us
174.252.116.137  usa      va      oakton         511         38.8892   -77.3222   22124        840           47           3860       6               us
68.55.144.242    usa      md      pasadena       512         39.1276   -76.5125   21122        840           21           4358       6               us
174.252.83.27    usa      pa      lancaster      566         40.0459   -76.3542   17603        840           39           333        6               us
174.78.192.31    usa      ga      warner robins  503         32.5958   -83.6384   31088        840           11           5052       6               us
98.192.146.235   usa      fl      panama city    656         30.1804   -85.5598   32404        840           10           671        6               us
71.192.181.20    usa      ma      springfield    543         42.1187   -72.5483   1109         840           22           967        6               us
76.183.54.227    usa      tx      dallas         623         32.7202   -96.6769   75217        840           44           77         6               us
69.62.143.140    usa      ca      granite bay    862         38.7442   -121.191   95746        840           5            49451      6               us
69.142.76.32     usa      nj      penns grove    504         39.707    -75.4467   8069         840           31           2335       6               us
70.173.54.93     usa      nv      las vegas      839         36.2059   -115.225   89108        840           29           173        6               us
98.242.8.222     usa      ca      fresno         866         36.7968   -119.883   93722        840           5            19         6               us

Problem Statement:-

We are expecting traffic of around 100 million calls per day for the IP address lookup.

The dataset will be worldwide, i.e. covering all countries, and we are planning to store it in a MySQL database that will effectively act as an in-memory database here.

Now my question is: should I create only a single table with the above columns, or should I normalize it into two or three tables? And do I need to create an index on any columns?

Basically I am trying to work out how to set up the schema so that lookups stay fast at the traffic levels we expect and the lookup service can respond very quickly.

I have been reading about star schemas, so if I should go forward with a star schema, how would I do that?
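As a point of comparison, here is a minimal single-table sketch; all names and the index strategy are assumptions, not a confirmed design:

    CREATE TABLE ip_location (
      start_ip_num  BIGINT UNSIGNED NOT NULL,
      end_ip_num    BIGINT UNSIGNED NOT NULL,
      country       CHAR(3),
      region        CHAR(2),
      city          VARCHAR(64),
      metro_code    INT,
      latitude      DECIMAL(8,4),
      longitude     DECIMAL(8,4),
      postal_code   VARCHAR(10),
      PRIMARY KEY (start_ip_num, end_ip_num)
    );

    -- a BETWEEN on both bounds cannot seek efficiently with one index, so a common trick
    -- is to seek to the nearest start and then verify the end bound
    SELECT *
    FROM ip_location
    WHERE start_ip_num <= INET_ATON('72.70.100.111')
    ORDER BY start_ip_num DESC
    LIMIT 1;
    -- then check (in the application or an outer query) that end_ip_num covers the address

Whether to split the location attributes into a dimension table is then mostly a size/maintenance question rather than a lookup-speed one.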

Postgres copy data with \xYY as plain string instead of interpreting as encoded string

Posted: 11 Mar 2013 09:24 PM PDT

I have a log file full of URLs, generated by Bro IDS.

When Bro logs a URL with non-ascii characters in it, it inserts \xYY where YY is the hexadecimal character code. Also, some URLs contain "\x".

Is there a setting or flag I can use with the COPY or \copy command to stop Postgres from trying to interpret these sequences and just store them in the text field as-is?
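A possible workaround (a sketch under assumptions, not from the original post): COPY's text format interprets backslash escapes, but its CSV format does not, so loading the file as CSV with delimiter and quote characters that never occur in the data leaves the \xYY sequences untouched. Table name, file path, and the chosen control characters are all hypothetical:

    COPY bro_urls (url)
    FROM '/path/to/urls.log'
    WITH (FORMAT csv, DELIMITER E'\x01', QUOTE E'\x02');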

How to make Greenplum 4.2.3 only scan the intended partition?

Posted: 11 Mar 2013 07:53 PM PDT

When I use unnest() in a view and then use that view in a SELECT statement, Greenplum seems to stop pruning to the intended partition and scans all partitions of the main table instead. The same thing happens when using a subquery instead of a view.

For example:

SELECT something
FROM (SELECT unnest(an_array_field) FROM table_y)
WHERE date = 'somedate' AND state = 'somestate'

We currently have two servers running two different versions of Greenplum. Server A runs an older version (4.2.1) while Server B runs 4.2.3. Running the same query above behaves differently: Server A (old) returns in a few seconds while Server B (new) takes forever. An EXPLAIN of the query shows that Server A scans only one partition (matching the date and state in the WHERE clause) while Server B scans every partition, causing the slowness.

The table structure in both databases is the same, and running the query without the unnest does not show the problem, so I suspect it has something to do with the new version. Is there anything I can do to solve this problem?
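One workaround worth trying (a sketch, not a confirmed fix for this planner behavior): push the partition predicates into the inner query so the planner sees them before the unnest:

    SELECT something
    FROM (
        SELECT unnest(an_array_field) AS something
        FROM table_y
        WHERE date = 'somedate'
          AND state = 'somestate'
    ) t;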

In SSIS 2012, what is the purpose of flagging a package as an "Entry Point" package

Posted: 11 Mar 2013 03:26 PM PDT

In the Visual Studio designer you can right-click an SSIS package and designate it as an "Entry Point" package:

Screen shot of Visual Studio designer

Doing a search I found this page on MSDN which states:

The value of 1 signifies that the package is meant to be started directly. The value of 0 signifies that the package is meant to be started by another package with the Execute Package task. The default value is 1.

With this flag enabled and disabled I have been able to execute a package directly.

What is the purpose of enabling or disabling this flag? Is it merely to document the intentions of your own SSIS packages, or does SQL Server/SSIS behave differently when it is enabled or disabled?

Time as a measure

Posted: 11 Mar 2013 06:51 PM PDT

Is it possible to have times as measures in a cube?

We're trying to view employee start times by day, aggregating as an average over time, but even with a No Aggregation measure type I get an error on deployment saying that StartTime is a string value.

Is this at all possible? It doesn't seem like such a crazy thing to want to do...
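A common workaround (a sketch under assumptions; the table, column, and view names are made up) is to expose the time as a numeric measure, such as seconds since midnight, which the cube can sum or average and the front end can format back to a time:

    -- in the data source view / a view feeding the measure group
    SELECT
        EmployeeKey,
        DateKey,
        DATEDIFF(SECOND, 0, CAST(StartTime AS datetime)) AS StartTimeSeconds
    FROM dbo.EmployeeShift;

    -- converting an averaged value back for display
    SELECT CONVERT(varchar(8), DATEADD(SECOND, 30600, 0), 108);  -- 30600 s -> '08:30:00'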

Blocking incoming database links

Posted: 11 Mar 2013 02:03 PM PDT

Let's say there is an Oracle database A and another Oracle database B. Our database is A.

Is there a way to block database links from B to A?

Stop SQL Server service(s) before defragmenting drive?

Posted: 11 Mar 2013 02:32 PM PDT

Our production SQL Server 2005 database's data files live on a separate physical drive, which Microsoft Windows 2003's Disk Defragmenter tool reports as 99% fragmented.

We scheduled a task to defragment this drive at 3:00 a.m. on a Saturday morning. The job completed after 40 minutes with no apparent errors. However, the drive remains heavily fragmented.

Should we have stopped SQL Server service(s) before defragmenting?

CONTEXT

Per requests for context: We have a Microsoft SQL Server 2005 instance (9.00.5324.00) running 32-bit Windows Server 2003 (SP2) on Dell PowerEdge 2950 hardware, circa 2007, with 4GB RAM. The PowerEdge 2950 has four 68GB drives configured as RAID-1 to create two 68GB virtual disks: (1) C (boot and OS) & D (pagefile, miscellaneous other data); and (2) E (SQL data). To my knowledge, IT staff have never defragmented any of these drives. Disk Defragmenter reports file fragmentation of 66% (C), 77% (D), and 99% (E).

Performance Monitor reports the following average results: "Paging file: % usage" = ~6.8%; "SQL Server: Buffer Manager - Page life expectancy" = 20 seconds; and "PhysicalDisk: Avg. disk sec/write, drive E" = between 300 and 1,100 ms.

We're due for a much-needed hardware and SQL Server upgrade in a few months' time (viz., new hardware, 64-bit Windows Server 2012, 64-bit SQL Server 2012, 12GB RAM), but, due to end-user performance, we want to alleviate the issue as much as possible now. Thus the thinking that a file defrag might help for drive E, the main SQL data drive.

As an aside, last week we pulled two failed drives and rebuilt the array...not sure that matters. We contract with another IT team to maintain the server, so we do not have direct access to the equipment...our organization just pays for services.

We can afford the downtime during regularly scheduled maintenance windows (weekly) as well as out-of-band downtime, as necessary, overnight.

Representing SQL constraints on a table

Posted: 11 Mar 2013 08:21 PM PDT

I have this table:

CREATE TABLE `sold` (
  `part_ID` INT(11) NOT NULL,
  `date` DATE NOT NULL,
  PRIMARY KEY (`part_ID`, `date`),
  FOREIGN KEY (`part_ID`) REFERENCES `part` (`part_ID`)
);

This table represents parts sold each day. The constraint says the number of sales per day should be at least 25 and at most 100.

I think it should start with something like this:

CHECK ( NOT EXISTS ...  
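Note that MySQL (at least through 5.x) parses CHECK clauses but does not enforce them, so a CHECK ( NOT EXISTS ... ) would not actually constrain anything. One hedged sketch, assuming MySQL 5.5+ and that the limit applies per date, is a trigger for the upper bound; the lower bound of 25 cannot be verified row-by-row at insert time and would need an end-of-day check instead:

    DELIMITER //
    CREATE TRIGGER sold_daily_limit
    BEFORE INSERT ON sold
    FOR EACH ROW
    BEGIN
        DECLARE cnt INT;
        SELECT COUNT(*) INTO cnt FROM sold WHERE `date` = NEW.`date`;
        IF cnt >= 100 THEN
            SIGNAL SQLSTATE '45000'
                SET MESSAGE_TEXT = 'At most 100 parts may be sold per day';
        END IF;
    END//
    DELIMITER ;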

How to read this output from MySQL?

Posted: 11 Mar 2013 01:39 PM PDT

Query that is being ran:

INSERT INTO `log_url` (`url_id`, `visitor_id`, `visit_time`) VALUES (?, ?, ?)  

Error that is being thrown:

  SQLSTATE[23000]: Integrity constraint violation: 1062 Duplicate entry '41721' for key 1  

Where does 41721 come from? How do I read this error, and how do I find out what 'key 1' is?
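A hedged reading (the exact key depends on the table definition): in MySQL's duplicate-entry errors, 'key 1' is the first key on the table, which is normally the PRIMARY KEY, and 41721 is the value that collided. To confirm which index that is and which existing row already holds the value:

    SHOW CREATE TABLE log_url;     -- lists the primary key and other indexes in order
    SHOW INDEX FROM log_url;

    -- assuming url_id is the primary-key column (an assumption):
    SELECT * FROM log_url WHERE url_id = 41721;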

Use single table with foreign key or two twin tables for text entries?

Posted: 11 Mar 2013 06:38 PM PDT

I have a table answers that has several columns, including the content itself, such as

ID | q_id | list_id | user_module_id | content | updated_timestamp  

The content column is a varchar since we want to be able to search through the answer content quickly.

However, we also want to be able to store full text as answers. There are two approaches:

A. Using two "twin" tables (with the same column names): one with a content column of type varchar and one with a content column of type text.

B. Using a single answer table and storing in the content column a hash key such as q_id7user_module_id362, with another table that contains only the text entry:

hashkey | content  

This approach seems cleaner; however, it means doing two UPDATEs and INSERTs for each entry modification and a JOIN on the two tables to pull the information back out.

In general, is it bad practice to have "twin" tables in a database?

Thanks!

What index to add in case of join by two optional fields

Posted: 11 Mar 2013 01:09 PM PDT

I have a query similar to the one below. It joins two tables on fields that can have NULL values; a row matches if both tables have the same value, or the condition is treated as optional when the value is NULL.

SELECT B.X, A.Y
FROM tableA A
INNER JOIN tableB B
    ON  ISNULL(B.level5, A.f05) = A.f05
    AND ISNULL(B.level6, A.f06) = A.f06

The best index for table B I could come up with is this:

CREATE NONCLUSTERED INDEX [Index_On_Table_B] ON B
(
    X ASC
)
INCLUDE (level5, level6)

It helps a bit, and from the execution plan I can see the index is used, but 90% of the cost is spent in the Nested Loops (Inner Join) operator.

Is there any way how to get this query working faster?
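One thing worth testing (a sketch under assumptions about the column roles, not a verified fix): making the join columns the index keys rather than INCLUDEd columns, so the optimizer at least has the option of seeking on them:

    CREATE NONCLUSTERED INDEX IX_tableB_level5_level6
    ON tableB (level5, level6)
    INCLUDE (X);

The ISNULL() wrapper on B's columns still limits seeking, so rewriting the optional conditions as explicit OR / UNION branches may also be worth an experiment.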

Granting permissions only on a set list of objects

Posted: 11 Mar 2013 06:36 PM PDT

I have a SQL Server 2005 database with a large number of tables in the dbo schema. I now created a new schema (call it myschema) that only has three table-valued functions and two stored procedures in it. All of that code has access to the tables in dbo.

The code in myschema will ultimately be called from a web service and I am struggling to get the permissions right for the user I created for the web service.

At first, I created the user with no roles except public and then gave it specific permissions on the securables in myschema. But then I could log on using that user and select from (and even update) anything in dbo.

So I gave the user the db_denydatareader and db_denydatawriter roles, which effectively restricted access to the objects in dbo.

The result of this is that I can execute the two stored procedures just fine.

But if I try to use the table-valued functions, I get this error:

The SELECT permission was denied on the object 'MyFunction', database 'MyDB', schema 'myschema'.

This is despite my use of:

grant select on myschema.MyFunction to MyUser  

I'm guessing that's because of my brilliant use of denydatareader.

So what is the correct way to give a user access only to a list of specific stored procedures and table-valued functions and not to anything else?
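For what it's worth, DENY takes precedence over GRANT in SQL Server, so db_denydatareader will override the SELECT grant on the table-valued functions. A minimal sketch of the grant-only approach (object names are placeholders) looks like this:

    -- leave the user in only the public role, with no deny roles
    GRANT EXECUTE ON OBJECT::myschema.MyProcedure1 TO MyUser;
    GRANT EXECUTE ON OBJECT::myschema.MyProcedure2 TO MyUser;
    GRANT SELECT  ON OBJECT::myschema.MyFunction   TO MyUser;

    -- or, per schema, if everything callable lives in myschema:
    GRANT EXECUTE, SELECT ON SCHEMA::myschema TO MyUser;

If the dbo objects were reachable without explicit grants, it may also be worth checking whether they carry grants to public or to guest.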

What fillfactor for caching table?

Posted: 11 Mar 2013 07:31 PM PDT

I have a heavily updated and accessed table where I store serialized Java objects. They stay in the table for 2-3 hours (and are updated during that period) and are then removed. The table is around 300 MB. I have noticed it is VACUUMed very, very often and wonder whether changing the fillfactor would help?
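A hedged sketch (the table name and the values are assumptions to experiment with, not recommendations): a lower fillfactor leaves free space on each page so updates can stay on the same page (HOT updates), which tends to reduce bloat on heavily updated tables; per-table autovacuum settings can be tuned alongside it:

    ALTER TABLE java_objects SET (fillfactor = 70);
    -- fillfactor only affects newly written pages; existing pages are only rewritten
    -- by VACUUM FULL or CLUSTER (both take exclusive locks)

    ALTER TABLE java_objects SET (autovacuum_vacuum_scale_factor = 0.05);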

Cannot connect from the DMZ to a named instance of SQL Server

Posted: 11 Mar 2013 04:31 PM PDT

I have a problem with my SQL Server.

  • On my SQL Server machine, there are two instances. I can connect to the default instance from the DMZ.
  • When I connect to the named instance DBSERVER\NAMEDINSTANCE, the connection will simply timeout.

If someone can help me, then I'll be happy :)

Thanks.

PostgreSQL Sequential Scan instead of Index Scan Why?

Posted: 11 Mar 2013 04:05 PM PDT

Hi all, I've got a problem with a PostgreSQL query and am wondering if anyone can help. In some scenarios my query seems to ignore the index that I've created, which is used for joining the two tables data and data_area. When this happens it uses a sequential scan, resulting in a much slower query.

Sequential Scan (~5 minutes)

Unique  (cost=15368261.82..15369053.96 rows=200 width=1942) (actual time=301266.832..301346.936 rows=153812 loops=1)
  CTE data
    ->  Bitmap Heap Scan on data  (cost=6086.77..610089.54 rows=321976 width=297) (actual time=26.286..197.625 rows=335130 loops=1)
          Recheck Cond: (datasetid = 1)
          Filter: ((readingdatetime >= '1920-01-01 00:00:00'::timestamp without time zone) AND (readingdatetime <= '2013-03-11 00:00:00'::timestamp without time zone) AND (depth >= 0::double precision) AND (depth <= 99999::double precision))
          ->  Bitmap Index Scan on data_datasetid_index  (cost=0.00..6006.27 rows=324789 width=0) (actual time=25.462..25.462 rows=335130 loops=1)
                Index Cond: (datasetid = 1)
  ->  Sort  (cost=15368261.82..15368657.89 rows=158427 width=1942) (actual time=301266.829..301287.110 rows=155194 loops=1)
        Sort Key: data.id
        Sort Method: quicksort  Memory: 81999kB
        ->  Hash Left Join  (cost=15174943.29..15354578.91 rows=158427 width=1942) (actual time=300068.588..301052.832 rows=155194 loops=1)
              Hash Cond: (data_area.area_id = area.id)
              ->  Hash Join  (cost=15174792.93..15351854.12 rows=158427 width=684) (actual time=300066.288..300971.644 rows=155194 loops=1)
                    Hash Cond: (data.id = data_area.data_id)
                    ->  CTE Scan on data  (cost=0.00..6439.52 rows=321976 width=676) (actual time=26.290..313.842 rows=335130 loops=1)
                    ->  Hash  (cost=14857017.62..14857017.62 rows=25422025 width=8) (actual time=300028.260..300028.260 rows=26709939 loops=1)
                          Buckets: 4194304  Batches: 1  Memory Usage: 1043357kB
                          ->  Seq Scan on data_area  (cost=0.00..14857017.62 rows=25422025 width=8) (actual time=182921.056..291687.996 rows=26709939 loops=1)
                                Filter: (area_id = ANY ('{28,29,30,31,32,33,25,26,27,18,19,20,21,12,13,14,15,16,17,34,35,1,2,3,4,5,6,22,23,24,7,8,9,10,11}'::integer[]))
              ->  Hash  (cost=108.49..108.49 rows=3349 width=1258) (actual time=2.256..2.256 rows=3349 loops=1)
                    Buckets: 1024  Batches: 1  Memory Usage: 584kB
                    ->  Seq Scan on area  (cost=0.00..108.49 rows=3349 width=1258) (actual time=0.007..0.666 rows=3349 loops=1)
Total runtime: 301493.379 ms

Index Scan (~3 seconds) (on explain.depesz.com)

Unique  (cost=17352256.47..17353067.50 rows=200 width=1942) (actual time=3603.303..3681.619 rows=153812 loops=1)
  CTE data
    ->  Bitmap Heap Scan on data  (cost=6284.60..619979.56 rows=332340 width=297) (actual time=26.201..262.314 rows=335130 loops=1)
          Recheck Cond: (datasetid = 1)
          Filter: ((readingdatetime >= '1920-01-01 00:00:00'::timestamp without time zone) AND (readingdatetime <= '2013-03-11 00:00:00'::timestamp without time zone) AND (depth >= 0::double precision) AND (depth <= 99999::double precision))
          ->  Bitmap Index Scan on data_datasetid_index  (cost=0.00..6201.51 rows=335354 width=0) (actual time=25.381..25.381 rows=335130 loops=1)
                Index Cond: (datasetid = 1)
  ->  Sort  (cost=17352256.47..17352661.98 rows=162206 width=1942) (actual time=3603.302..3623.113 rows=155194 loops=1)
        Sort Key: data.id
        Sort Method: quicksort  Memory: 81999kB
        ->  Hash Left Join  (cost=1296.08..17338219.59 rows=162206 width=1942) (actual time=29.980..3375.921 rows=155194 loops=1)
              Hash Cond: (data_area.area_id = area.id)
              ->  Nested Loop  (cost=0.00..17334287.66 rows=162206 width=684) (actual time=26.903..3268.674 rows=155194 loops=1)
                    ->  CTE Scan on data  (cost=0.00..6646.80 rows=332340 width=676) (actual time=26.205..421.858 rows=335130 loops=1)
                    ->  Index Scan using data_area_pkey on data_area  (cost=0.00..52.13 rows=1 width=8) (actual time=0.006..0.008 rows=0 loops=335130)
                          Index Cond: (data_id = data.id)
                          Filter: (area_id = ANY ('{28,29,30,31,32,33,25,26,27,18,19,20,21,12,13,14,15,16,17,34,35,1,2,3,4,5,6,22,23,24,7,8,9,10,11}'::integer[]))
              ->  Hash  (cost=1254.22..1254.22 rows=3349 width=1258) (actual time=3.057..3.057 rows=3349 loops=1)
                    Buckets: 1024  Batches: 1  Memory Usage: 584kB
                    ->  Index Scan using area_primary_key on area  (cost=0.00..1254.22 rows=3349 width=1258) (actual time=0.012..1.429 rows=3349 loops=1)
Total runtime: 3706.630 ms

Table Structure

This is the table structure for the data_area table. I can provide the other tables if need be.

CREATE TABLE data_area
(
  data_id integer NOT NULL,
  area_id integer NOT NULL,
  CONSTRAINT data_area_pkey PRIMARY KEY (data_id, area_id),
  CONSTRAINT data_area_area_id_fk FOREIGN KEY (area_id)
      REFERENCES area (id) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE NO ACTION,
  CONSTRAINT data_area_data_id_fk FOREIGN KEY (data_id)
      REFERENCES data (id) MATCH SIMPLE
      ON UPDATE CASCADE ON DELETE CASCADE
);

QUERY

WITH data AS (
    SELECT *
    FROM data
    WHERE
        datasetid IN (1)
        AND (readingdatetime BETWEEN '1920-01-01' AND '2013-03-11')
        AND depth BETWEEN 0 AND 99999
)
SELECT *
FROM (
    SELECT DISTINCT ON (data.id) data.id, *
    FROM
        data,
        data_area
        LEFT JOIN area ON area_id = area.id
    WHERE
        data_id = data.id
        AND area_id IN (28,29,30,31,32,33,25,26,27,18,19,20,21,12,13,14,15,16,17,34,35,1,2,3,4,5,6,22,23,24,7,8,9,10,11)
) AS s;

It returns 153812 rows. I did SET enable_seqscan = false; to disable sequential scans and obtain the index-scan plan.

I've tried running ANALYZE on the database and increasing the statistics gathered on the columns used in the query, but nothing seems to help.

Could anyone shed any light on this or suggest anything else I should try?
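Since raising statistics in general didn't help, a couple of narrower knobs are sometimes worth testing (the values are assumptions to experiment with, not a diagnosis):

    -- give the planner a better picture of area_id's distribution in the big table
    ALTER TABLE data_area ALTER COLUMN area_id SET STATISTICS 1000;
    ANALYZE data_area;

    -- if the data is mostly cached, lowering random_page_cost makes index scans look cheaper
    SET random_page_cost = 2.0;   -- default is 4.0; try per-session first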

Oracle 11g listener fails with ORA-12514 and ORA-12505 errors

Posted: 11 Mar 2013 08:41 PM PDT

I run an instance of Oracle 11g locally on my development machine and can connect to the local instance directly via SqlPlus:

c:\>sqlplus ace

SQL*Plus: Release 11.2.0.2.0 Production on Mon Mar 11 11:50:20 2013

Copyright (c) 1982, 2010, Oracle.  All rights reserved.

Enter password:

Connected to:
Oracle Database 11g Express Edition Release 11.2.0.2.0 - Beta

SQL> select count(*) from my_table ;

  COUNT(*)
----------
      5297

But I cannot connect to it via the listener:

c:\>sqlplus -L "user/pw@(DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = localhost)(PORT = 1521)) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = XE)))"

SQL*Plus: Release 11.2.0.2.0 Production on Mon Mar 11 11:52:40 2013

Copyright (c) 1982, 2010, Oracle.  All rights reserved.

ERROR:
ORA-12514: TNS:listener does not currently know of service requested in connect descriptor

SP2-0751: Unable to connect to Oracle.  Exiting SQL*Plus

Similarly, if I connect via SqlDeveloper I get an error (albeit ORA-12505, TNS:listener does not currently know of SID given in connect descriptor).

This instance has been stable and working fine for a year or more until today, a Monday morning. Our corporate IT do sometimes push new policies and updates over the weekend, so I'm assuming that something has changed, but I've not been able to work out what.

I've restarted the service and the listener several times, the listener log doesn't give any clues.

The listener seems fine:

c:\>lsnrctl status

LSNRCTL for 32-bit Windows: Version 11.2.0.2.0 - Beta on 11-MAR-2013 11:55:33

Copyright (c) 1991, 2010, Oracle.  All rights reserved.

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=EXTPROC1)))
STATUS of the LISTENER
------------------------
Alias                     LISTENER
Version                   TNSLSNR for 32-bit Windows: Version 11.2.0.2.0 - Beta
Start Date                11-MAR-2013 11:17:30
Uptime                    0 days 0 hr. 38 min. 3 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Default Service           XE
Listener Parameter File   C:\oraclexe\app\oracle\product\11.2.0\server\network\admin\listener.ora
Listener Log File         C:\oraclexe\app\oracle\diag\tnslsnr\FBC305BB46560\listener\alert\log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(PIPENAME=\\.\pipe\EXTPROC1ipc)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=machine.domain.com)(PORT=1521)))
Services Summary...
Service "CLRExtProc" has 1 instance(s).
  Instance "CLRExtProc", status UNKNOWN, has 1 handler(s) for this service...
Service "PLSExtProc" has 1 instance(s).
  Instance "PLSExtProc", status UNKNOWN, has 1 handler(s) for this service...
The command completed successfully

Port 1521 seems ok:

c:\>netstat -an -O | find /i "1521"
  TCP    0.0.0.0:1521           0.0.0.0:0              LISTENING       4368
  TCP    169.243.90.109:55307   159.185.207.100:1521   ESTABLISHED     12416
  TCP    [::]:1521              [::]:0                 LISTENING       4368

(PID 4368 is TNSLSNR.exe process.)

Also, I can tnsping to the XE service:

c:\>tnsping xe

TNS Ping Utility for 32-bit Windows: Version 11.2.0.2.0 - Beta on 11-MAR-2013 12:27:47

Copyright (c) 1997, 2010, Oracle.  All rights reserved.

Used parameter files:
C:\oraclexe\app\oracle\product\11.2.0\server\network\admin\sqlnet.ora

Used TNSNAMES adapter to resolve the alias
Attempting to contact (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = machine.domain.com)(PORT = 1521)) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = XE)))
OK (210 msec)

The listener.ora file:

SID_LIST_LISTENER =
  (SID_LIST =
    (SID_DESC =
      (SID_NAME = PLSExtProc)
      (ORACLE_HOME = C:\oraclexe\app\oracle\product\11.2.0\server)
      (PROGRAM = extproc)
    )
    (SID_DESC =
      (SID_NAME = CLRExtProc)
      (ORACLE_HOME = C:\oraclexe\app\oracle\product\11.2.0\server)
      (PROGRAM = extproc)
    )
  )

LISTENER =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC1))
      (ADDRESS = (PROTOCOL = TCP)(HOST = machine.domain.com)(PORT = 1521))
    )
  )

DEFAULT_SERVICE_LISTENER = (XE)

Additionally, and I've no idea if it is related, I can't seem to access apex on https://127.0.0.1:8080/apex (even though the permissions for that seem fine).

So where else should I be looking?

Update with requested information:

SQL> show parameter service_names

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
service_names                        string      XE

SQL> show parameter local_listener

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
local_listener                       string
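One hedged troubleshooting step (an assumption, not a confirmed fix): the lsnrctl status output above shows only the extproc services and no XE service, which is consistent with ORA-12514, and local_listener is empty, so the instance is registering with whatever it derives as the local listener address. If a weekend policy change altered hostname or DNS resolution, pointing LOCAL_LISTENER explicitly at the listener and forcing re-registration can make the XE service reappear:

    ALTER SYSTEM SET LOCAL_LISTENER =
      '(ADDRESS=(PROTOCOL=TCP)(HOST=machine.domain.com)(PORT=1521))' SCOPE=MEMORY;
    ALTER SYSTEM REGISTER;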

What is the best way to store X509 certificate in PostgreSQL database?

Posted: 11 Mar 2013 07:06 PM PDT

I'm working on web authenticate system where users will digitally sign random tokens and this will be checked with X509 certificates stored on the server.

Therefore I have to store several X509 certificates (PEM or DER format) in PostgreSQL database. Sounds easy, but I want to have possibility to search certificates with subject, issuer, notBefore, notAfter and similar criteria.

My idea is to have the following columns in the database: X509data, notAfter, notBefore, subject, issuer, etc. Then I will create an object (in SQLAlchemy) representing an X509 certificate, with methods like add_new_X509(), find_X509(search criteria), etc. Whenever I add a new certificate with add_new_X509(), it will automatically read all the data from the certificate, fill in the rest of the columns, and put the raw certificate into the X509data column.

Unfortunately this solution has two disadvantages:

  1. I will store the same information twice (in the X509 certificate itself and in separate columns for easy searching).
  2. Whenever I read an X509 certificate, my application will have to cross-check notAfter, notBefore, subject and issuer against the information stored in the original certificate (for security reasons, in case someone tries to modify these fields).

So, does anybody have a better idea or suggestion? Does anyone see any other security issue that could arise with this solution?
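A minimal sketch of the layout being described (names and types are assumptions): keep the raw certificate as the source of truth and index the extracted fields purely for searching:

    CREATE TABLE x509_certificate (
        id          serial PRIMARY KEY,
        x509_data   bytea        NOT NULL,   -- raw DER (store PEM as text instead if preferred)
        subject     text         NOT NULL,
        issuer      text         NOT NULL,
        not_before  timestamptz  NOT NULL,
        not_after   timestamptz  NOT NULL
    );

    CREATE INDEX x509_certificate_subject_idx   ON x509_certificate (subject);
    CREATE INDEX x509_certificate_not_after_idx ON x509_certificate (not_after);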

Most efficient way to sort data fields into SQL

Posted: 11 Mar 2013 03:43 PM PDT

I'm trying to decide on the most efficient way to sort various data values. Here's how the data arrives: device X sends a text string "name=value&name2=value&name=value".

On arrival that string is stuffed into a sql row along with the unique address of the sending device. This keeps data flowing in easily to a SQLite database.

My parsing script first gets all unique device addresses. Those are put in a hash for the parser and inserted into a new database (the hash contains the rowid from the DB after the insert, with more logic to keep race conditions out of the mix).

Then each row of string data is split up by the &, then by the =.

Here's the general table layout:

rawData(address, string, timestamp, processed)  

Each row is read from the rawData, sorted and marked as processed. (makes it easy to muck around with the parsing script)

Data is placed into these:

devices(rowid, address)
dataNames(rowid, devicesid, name)
dataValues(nameId, value, timestamp)

I'm trying to decide on the most efficient method of inserting this data. It ends up as thousands of rows. I suppose I could keep a hash of known device/name pairs; then, if the hash doesn't know about a new name, I can insert it and refresh the hash...

Am I missing something totally obvious? The goal is to keep selects to a minimum for efficiency!
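One way to keep the SELECTs down (a sketch under assumptions about the exact uniqueness rules): declare the device/name pair unique and let SQLite ignore duplicates, so the script can insert blindly and only look up the rowid when it is not already cached:

    CREATE UNIQUE INDEX IF NOT EXISTS dataNames_device_name
        ON dataNames (devicesid, name);

    INSERT OR IGNORE INTO dataNames (devicesid, name) VALUES (?, ?);

    -- only needed when the rowid isn't already in the script's hash
    SELECT rowid FROM dataNames WHERE devicesid = ? AND name = ?;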

How should I tune Postgresql for 20 GB of RAM?

Posted: 11 Mar 2013 03:44 PM PDT

I've been fortunate enough to get the use of a 20GB Linode instance running 64-bit Ubuntu. I want to optimize PostgreSQL for this server, but I don't know what I should prioritize changing.

I have several datasets of 20,000 or so rows and the calculations that are being performed are memory intensive queries (spatial analyses) with a small number of rows being written after each request. The total number of users is very small (10 - 50).

I've read through this article on the PostgreSQL site, but I don't know enough about how this works to know what to prioritize. I've also looked at advice on what to change for geo-type work here.

For example, I tried changing the shared_buffers to 200MB (which is much less than 75% of 20GB). This resulted in the following error message:

  • Restarting PostgreSQL 9.1 database server
    • The PostgreSQL server failed to start. Please check the log output:
      2013-03-10 12:21:58 EDT FATAL: could not create shared memory segment: Invalid argument
      2013-03-10 12:21:58 EDT DETAIL: Failed system call was shmget(key=5432001, size=47742976, 03600).
      2013-03-10 12:21:58 EDT HINT: This error usually means that PostgreSQL's request for a shared memory segment exceeded your kernel's SHMMAX parameter. You can either reduce the request size or reconfigure the kernel with larger SHMMAX. To reduce the request size (currently 47742976 bytes), reduce PostgreSQL's shared memory usage, perhaps by reducing shared_buffers or max_connections. If the request size is already small, it's possible that it is less than your kernel's SHMMIN parameter, in which case raising the request size or reconfiguring SHMMIN is called for. The PostgreSQL documentation contains more information about shared memory configuration.

I returned this to its original value and tried changing:

work_mem = 50MB
maintenance_work_mem = 256MB

My problem is that I don't know which values I should try changing, or how I should prioritize which values are key to experiment with and test. What should I specifically do to optimize this database?
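For orientation, here is a hedged starting point (all numbers are assumptions to test against the workload, not prescriptions). Note that shared_buffers is usually sized around 25% of RAM, not 75%, and on PostgreSQL 9.1 the kernel's SysV shared-memory limit has to be raised before a multi-gigabyte shared_buffers will even start:

    # postgresql.conf
    shared_buffers = 4GB            # ~25% of 20GB RAM is a common starting point
    effective_cache_size = 12GB     # planner hint: roughly what the OS will cache
    work_mem = 64MB                 # per sort/hash per connection; modest, given 10-50 users
    maintenance_work_mem = 512MB

    # /etc/sysctl.conf (then reload with sysctl -p) so the shmget() call succeeds
    kernel.shmmax = 8589934592      # 8 GB
    kernel.shmall = 2097152         # in 4 kB pages = 8 GB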

Efficient way to move rows across the tables?

Posted: 11 Mar 2013 07:10 PM PDT

This is somewhat long question as I would like to explain all details of the problem.

System Description

We have a queue of incoming messages from external system(s). Messages are immediately stored in, e.g., an INBOX table. A few worker threads fetch a chunk of work from the table (first marking some messages with an UPDATE, then SELECTing the marked messages). The workers do not process the messages themselves; they dispatch them to different internal components (called 'processors'), depending on the message command.

Each message contains several text fields (the longest is around 200 varchars), a few ids, some timestamp(s), etc.; 10-15 columns in total.

Each internal component (i.e. processor) that processes messages works differently. Some process the message immediately; others trigger a long operation, even communicating via HTTP with other parts of the system. In other words, we cannot simply process a message from the INBOX and then remove it; we must keep working with that message for a while (an async task).

Still, there are not too many processors in the system, up to 10.

Messages are all internal, i.e. it is not important for users to browse or paginate them. A user may require a list of processed relevant messages, but that is not a mission-critical feature, so it does not have to be fast. Some invalid messages may occasionally be deleted.

It's important to emphasize that the expected traffic might be quite high, and we don't want bottlenecks caused by bad database design. The database is MySQL.

Decision

One of the decisions is not to have one big table for all messages with some flag column indicating the various message states. The idea is to have a table per processor and to move messages around. For example, received messages will be stored in INBOX, then moved by the dispatcher to, e.g., a PROCESSOR_1 table, and finally moved to an ARCHIVE table. There should not be more than 2 such movements.

While in the processing state, we do allow flags to indicate processing-specific states, if any. In other words, a PROCESSOR_X table may track the state of its messages, since the number of messages currently being processed will be significantly smaller.

The reason for this is not to use one BIG table for everything.

Question

Since we are moving messages around, I wonder how expensive this is with high volumes. Which of the following scenarios is better:

(A) have all the separate, similar tables, as explained, and move complete message rows, e.g. read the complete row from INBOX, write it to the PROCESSOR table (with some additional columns), delete it from INBOX.

or

(B) to avoid physically moving the content, have one big MESSAGES table that just stores the content (and still not the state). We would still have the other tables, as explained above, but they would contain just the message IDs plus additional columns. Now, when a message moves, we physically move much less data - just the IDs. The rest of the message stays in the MESSAGES table, unmodified, the whole time.

In other words, is there a penalty in sql join between one smaller and one huge table?

Thank you for your patience, hope I was clear enough.
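For scenario (A), the movement itself is typically a short transaction; here is a minimal sketch, assuming InnoDB tables and that the ids for the chunk have already been selected (column names are assumptions):

    START TRANSACTION;

    INSERT INTO processor_1 (id, command, payload, received_at)
    SELECT id, command, payload, received_at
    FROM inbox
    WHERE id IN (101, 102, 103);      -- the claimed chunk

    DELETE FROM inbox
    WHERE id IN (101, 102, 103);

    COMMIT;

For scenario (B) the same pattern applies but only the id rows move, and each read pays for a join back to MESSAGES on its primary key, which is normally cheap.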

Write differences between varchar and nvarchar

Posted: 11 Mar 2013 02:19 PM PDT

Currently in our SQL Server 2012 database we're using varchar, and we'd like to change that to nvarchar. I've generated a script to do that.

My question is: are there any differences in how SQL Server writes to varchar columns vs. nvarchar columns? We have a number of back-end procedures that I'm concerned about.

Edit:
Not sure if this helps, but the columns don't have indexes, f/k, or constraints on them.
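One write-side difference worth checking in those procedures (general SQL Server behavior, not something specific to this schema): nvarchar literals and parameters should be Unicode, otherwise characters outside the column's code page are lost on the way in, and mixing the types can introduce implicit conversions:

    DECLARE @t TABLE (col NVARCHAR(20));

    INSERT INTO @t VALUES (N'naïve – ☺');  -- N prefix: Unicode literal, characters preserved
    INSERT INTO @t VALUES ('naïve – ☺');   -- varchar literal: out-of-code-page characters become '?'

    SELECT col FROM @t;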

How to properly kill MySQL?

Posted: 11 Mar 2013 09:06 PM PDT

I have CentOS 64bit with CPanel installed and I use

service mysql stop  

However, it keeps printing dots (.....) for minutes and never stops. It used to be instant. Any idea why it does that and how to fix it? Right now I have to do killall -9 mysql, but is there a better way?

The server is also very very active.

Is this a config issue? Do I have memory settings set too high?

[mysqld]
default-storage-engine=MyISAM
local-infile=0
symbolic-links=0
skip-networking
max_connections = 500
max_user_connections = 20
key_buffer = 512M
myisam_sort_buffer_size = 64M
join_buffer_size = 64M
read_buffer_size = 12M
sort_buffer_size = 12M
read_rnd_buffer_size = 12M
table_cache = 2048
thread_cache_size = 16K
wait_timeout = 30
connect_timeout = 15
tmp_table_size = 64M
max_heap_table_size = 64M
max_allowed_packet = 64M
max_connect_errors = 10
query_cache_limit = 1M
query_cache_size = 64M
query_cache_type = 1
low_priority_updates=1
concurrent_insert=ALWAYS
log-error=/var/log/mysql/error.log
tmpdir=/home/mysqltmp
myisam_repair_threads=4

[mysqld_safe]
open_files_limit = 8192
log-error=/var/log/mysql/error.log

[mysqldump]
quick
max_allowed_packet = 512M

[myisamchk]
key_buffer = 64M
sort_buffer = 64M
read_buffer = 16M
write_buffer = 16M
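As a gentler fallback than killall -9 while investigating (standard MySQL tooling, not anything cPanel-specific), mysqladmin asks the server for a clean shutdown, and tailing the error log shows what it is waiting on:

    mysqladmin -u root -p shutdown
    tail -f /var/log/mysql/error.log

Killing a mostly-MyISAM server with -9 can also leave tables needing repair at the next start, which may itself contribute to slow stops later if the server is busy repairing or flushing large key buffers.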

Why would mysql "show global status" query be taking 15 minutes?

Posted: 11 Mar 2013 03:36 PM PDT

I'm reviewing the slow log, and on one of my slaves the average time for SHOW GLOBAL STATUS is 914s.

Any idea how to determine the cause of this?

Is there a way to export Oracle's UNDO?

Posted: 11 Mar 2013 04:36 PM PDT

I tried the exp utility to dump the whole database. It looks like this exports only the latest version of the data, skipping the undo log. Using flashback queries I see:

01466. 00000 -  "unable to read data - table definition has changed"
*Cause:    Query parsed after tbl (or index) change, and executed
           w/old snapshot

What I'm trying to do is capture DB changes and make a backup for later use, with the ability to flash back to a timestamp.

With an RMAN backup I have a similar situation:

ORA-01555: snapshot too old: rollback segment number 3 with name "_SYSSMU3_2472002983$" too small
01555. 00000 -  "snapshot too old: rollback segment number %s with name \"%s\" too small"
*Cause:    rollback records needed by a reader for consistent read are
           overwritten by other writers

Update: I managed to do what I needed only by increasing undo retention, directly copying the data files, and modifying the control file on the cloned instance.

MySQL-5.5 InnoDB memory issue

Posted: 11 Mar 2013 07:37 PM PDT

Version in use is mysql-5.5.24.

In the Enterprise version of MySQL, innodb_buffer_pool space is released after a while; however, this does not appear to happen in the Community version.

I am not seeing free space in innodb_buffer_pool even after all the apps are shut down.

Is there such a difference between the Enterprise and Community versions of MySQL?

MySQL table relations, inheritance or not?

Posted: 11 Mar 2013 02:36 PM PDT

I'm building a micro CMS, using MySQL as the RDBMS and the Doctrine ORM for mapping.

I would like to have two types of pages: a Static Page and a Blog Page. A static page would have page_url and page_content stored in the database. A blog page would have page_url but no page_content; the blog would have Posts, Categories, and so on.

Lets say I have route like this:

/{pageurl}  

This is a page whose URL can be home, news, blog, etc. That page can be a Static Page, in which case I would just print page_content, but it can also be a Blog Page, in which case I would print the latest posts as the content.

How should I relate these Static Page and Blog Page tables? Is this inheritance, since both are pages with their own URL but different content? Should I use inheritance, so that both Static and Blog pages extend a Page that holds page_url? Or should I create another table page_types and store the available page types there?
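A minimal sketch of the class-table-inheritance option (Doctrine also supports single-table inheritance; all names here are assumptions):

    CREATE TABLE page (
        id        INT AUTO_INCREMENT PRIMARY KEY,
        page_url  VARCHAR(255) NOT NULL UNIQUE,
        page_type ENUM('static', 'blog') NOT NULL
    );

    CREATE TABLE static_page (
        page_id      INT PRIMARY KEY,
        page_content TEXT NOT NULL,
        FOREIGN KEY (page_id) REFERENCES page (id)
    );

    CREATE TABLE blog_post (
        id       INT AUTO_INCREMENT PRIMARY KEY,
        page_id  INT NOT NULL,               -- the blog page this post belongs to
        title    VARCHAR(255) NOT NULL,
        body     TEXT NOT NULL,
        FOREIGN KEY (page_id) REFERENCES page (id)
    );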

When should I think about upgrading our RDS MySQL instance based on memory usage?

Posted: 11 Mar 2013 01:36 PM PDT

It seems like our DB server is doing garbage collection at an increasingly faster rate, which seems normal since it's growing. What's a good rule of thumb for when to switch to a bigger instance? I'm not a DBA and have no frame of reference. It seems to be doing garbage collection once every 2-3 days now, whenever there's only about 100 MB of freeable memory left.


The server itself has 1.7GB of RAM.
