Wednesday, August 14, 2013

[how to] Using a PostgreSQL tablespace from two different servers


Using a PostgreSQL tablespace from two different servers

Posted: 14 Aug 2013 08:27 PM PDT

I have a PostgreSQL database living on an external hard disk (RAID0 via Thunderbolt). So far I accessed it from a PostgreSQL server running on my MacBook Pro. As my work on this database is getting more and more intensive the queries are getting more and more complex, too. Therefore I'd like to use my brand spanking new iMac from now on.

My question is: Can I somehow tell the new PostgreSQL server (running on the iMac) to use the data that is already living in the tablespace on the external disk? Or will I have to export & import all the data manually (pg_dump, pg_restore)?

Better way to iterate through tables with foreign keys

Posted: 14 Aug 2013 08:13 PM PDT

I have a master table (M) and 2 slave tables (S1, S2). S1 and S2 reference M through one-to-many foreign keys (F1, F2). Now, I'm writing a PHP (or whatever language) function to get some records from M, along with information stored in S1 and S2. Basically, there are 2 ways to do it:

  1. Execute a JOIN'ed SELECT statement that returns the content of all 3 tables (creating a VIEW would be better, but that's not the point here);
  2. Execute SELECT on M first, then query S1 and S2 for every record in M, in a loop.

This may sound stupid, but I'm thinking about which is the better approach. The concerns are:

  1. If M contains a lot of data, using a JOIN will return a whole lot of duplicate data, since every record in M is repeated in the output result set for each matching record in S1 and S2;
  2. Running subsequent queries on S1 and S2 incurs significant overhead to process the queries. Although things like the query cache and prepared statements improve performance, I'm not sure whether that is enough.
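For illustration, a minimal sketch of approach 1 (the single JOIN'ed SELECT); the table and column names below are assumptions, not the actual schema:

-- One round trip: rows from M repeat once per matching row in S1/S2.
SELECT m.id, m.name, s1.detail AS s1_detail, s2.detail AS s2_detail
FROM M m
LEFT JOIN S1 s1 ON s1.m_id = m.id
LEFT JOIN S2 s2 ON s2.m_id = m.id
WHERE m.id IN (1, 2, 3);

Approach 2 is the classic N+1 pattern: one SELECT on M followed by further queries per returned row, so the duplication concern in point 1 trades off against the per-query overhead in point 2.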

How to create a trigger in a different database?

Posted: 14 Aug 2013 07:46 PM PDT

Is it possible to create a stored procedure that creates a table trigger (DDL) in a different database than the one in which the stored procedure itself resides? The databases are on the same server instance. If yes, then how?

For example this does not work:

create PROCEDURE [dbo].[CreateTriggrer]
    @db varchar(60)
AS
BEGIN
    SET NOCOUNT ON;

    declare @statement nvarchar(max) = N'CREATE TRIGGER [dbo].[TestTrigger]
    ON [$database$].[dbo].[TestTable]
    AFTER INSERT, UPDATE, DELETE
AS
BEGIN
    PRINT ''test''
END'

    set @statement = REPLACE(@statement, '$database$', @db)

    EXEC dbo.sp_executesql @statement = @statement

END

When called like this:

EXEC [dbo].[CreateTriggrer] @db = N'TestDatabase'  

It returns this error:

Msg 2108, Level 15, State 1, Procedure TestTrigger, Line 6 Cannot create trigger on 'TestDatabase.dbo.TestTable' as the target is not in the current database.

Which is fair enough. Is there a way to achieve what I want?
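One possible workaround, sketched below (untested, and the names mirror the example above): call sp_executesql through a three-part name built from @db, so the CREATE TRIGGER batch executes in the context of the target database, which is where CREATE TRIGGER requires the table to live.

CREATE PROCEDURE [dbo].[CreateTriggerInDb]
    @db sysname
AS
BEGIN
    SET NOCOUNT ON;

    DECLARE @statement nvarchar(max) = N'
CREATE TRIGGER [dbo].[TestTrigger]
ON [dbo].[TestTable]
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
    PRINT ''test''
END';

    -- Executing <target db>.sys.sp_executesql runs the batch inside that
    -- database, so the two-part table name resolves there.
    DECLARE @proc nvarchar(400) = QUOTENAME(@db) + N'.sys.sp_executesql';
    EXEC @proc @statement;
END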

Oracle: list user-created tables in the SYS schema

Posted: 14 Aug 2013 05:46 PM PDT

I need to delete all tables (hundreds) in the SYS schema that someone accidentally created with sqlplus. Looking in dba_tables and dba_objects, I don't see a way to distinguish these tables from pre-existing stock system tables. Any help appreciated.
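One hedged starting point: user-created tables tend to stand out by their CREATED timestamp in DBA_OBJECTS, since the stock dictionary objects date from database creation or the last upgrade.

-- SYS-owned tables, newest first; recent CREATED dates are the likely culprits.
SELECT object_name, created
FROM   dba_objects
WHERE  owner = 'SYS'
AND    object_type = 'TABLE'
ORDER  BY created DESC;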

SQL Server - compare pictures in db as varbinary [on hold]

Posted: 14 Aug 2013 08:00 PM PDT

Is it possible to create a stored procedure that returns a similarity ratio for images saved in the database as varbinary? The input parameters could be two image ids, and the output the similarity ratio.

This is my db structure

CREATE TABLE [dbo].[Picture]
(
    [Id] [int] IDENTITY(1,1) NOT NULL,
    [PictureBinary] [varbinary](max) NULL,
    [MimeType] [nvarchar](40) NULL,
    [IsNew] [bit] NULL,
    [SeoFilename] [nvarchar](300) NULL
)
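Plain T-SQL has no built-in image-similarity function, so a similarity ratio would need SQLCLR or external processing; inside the database you can realistically only check for exact binary duplicates. A minimal sketch of that check (the two ids are hypothetical):

DECLARE @idA int = 1, @idB int = 2;  -- hypothetical picture ids

-- Exact-duplicate test only: compares the raw bytes of the two pictures.
SELECT CASE WHEN a.PictureBinary = b.PictureBinary THEN 1 ELSE 0 END AS identical
FROM dbo.Picture AS a
JOIN dbo.Picture AS b ON b.Id = @idB
WHERE a.Id = @idA;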

What happens if the mysql database's innodb log files are lost?

Posted: 14 Aug 2013 07:51 PM PDT

What I did was

/etc/init.d/mysql stop  

then removed the files: ib_logfile0, ib_logfile1

then modified my.cnf file, variable: innodb_log_file_size

and then:

/etc/init.d/mysql start  

and allowed the files to be recreated

I later discovered that the global variable innodb_fast_shutdown is set to "1".

The question is: how much data was lost?

Note: I still have the old files ib_logfile0, ib_logfile1, not deleted yet.

And the website relying on the database appears to be working.
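For what it's worth, a clean shutdown with innodb_fast_shutdown = 1 still leaves the data files consistent without the redo log (only value 2 behaves like a crash), so recreating the log files after a clean stop should not in itself lose data. A quick sanity check, as a sketch (the table name is a placeholder):

SHOW GLOBAL VARIABLES LIKE 'innodb_fast_shutdown';  -- 1 = fast but clean shutdown
SHOW GLOBAL VARIABLES LIKE 'innodb_log_file_size';  -- confirm the new size took effect
CHECK TABLE some_important_table;                   -- spot-check a table you care about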

Connecting to an external database with pgAdmin III

Posted: 14 Aug 2013 03:40 PM PDT

I'm trying to connect to an external database from pgAdmin III (which is installed on both machines). The client complains:

Access to database denied

The server doesn't grant access to the database: the server reports

FATAL: no pg_hba.conf entry for host "192.168.108.161", user "postgres", database "postgres", SSL off

To access a database on a PostgreSQL server, you first have to grant primary access to the server for your client (Host Based Authentication). PostgreSQL will check the pg_hba.conf file if a pattern that matches your client address / username / database is present and enabled before any SQL GRANT access control lists are evaluated.

The initial settings in pg_hba.conf are quite restrictive, in order to avoid unwanted security holes caused by unreviewed but mandatory system settings. You'll probably want to add something like

host all all 192.168.0.0/24 md5   

This example grants MD5 encrypted password access to all databases to all users on the private network 192.168.0.0/24. You can use the pg_hba.conf editor that is built into pgAdmin III to edit the pg_hba.conf configuration file.

After changing pg_hba.conf, you need to trigger a server configuration reload using pg_ctl or by stopping and restarting the server process.

Yet the server's pg_hba.conf explicitly states that all connections from the internal network should be accepted:

pg_hba.conf

# TYPE  DATABASE        USER            ADDRESS                 METHOD

# "local" is for Unix domain socket connections only
local   all             all                                     md5
host    all             all             127.0.0.1/32            md5
host    all             all             ::1/128                 md5
host    all             all             192.168.0.0/24          md5

I have already restarted the postmaster for the changes to take effect, and have gone so far as to restart the entire machine. What else could be going wrong here?
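Note that the client address in the error (192.168.108.161) is not actually covered by the 192.168.0.0/24 line, so a line matching that subnet (for example 192.168.108.0/24, or a wider 192.168.0.0/16) may be what's missing. It is also worth confirming which pg_hba.conf the server really loaded and reloading it, as a sketch:

-- Run on the server as a superuser:
SHOW hba_file;            -- path of the pg_hba.conf the server is actually reading
SELECT pg_reload_conf();  -- reload after editing (no full restart required)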

Extract time portion of Datetime2(7) in hhmmssfff format

Posted: 14 Aug 2013 12:38 PM PDT

I need to extract the time portion of a datetime2(7) column in hhmmssfff format and I am doing it like this:

DECLARE @mdate AS DATETIME2(7)
SET @mdate = '2012-03-15 10:13:27.5437431'

SELECT LEFT(REPLACE(REPLACE(CONVERT(VARCHAR, @mdate, 114), ':', ''), '.', ''), 9)

Is there a better approach than doing this ugly replaces/convert/left?

I need this to join to a DimTime dimension whose key is in hhmmssfff format.
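One hedged alternative is to build the number arithmetically from DATEPART, assuming the DimTime key is an integer in hhmmssfff form (if it is a zero-padded string instead, the result still needs padding):

DECLARE @mdate AS DATETIME2(7) = '2012-03-15 10:13:27.5437431';

SELECT DATEPART(HOUR,        @mdate) * 10000000
     + DATEPART(MINUTE,      @mdate) * 100000
     + DATEPART(SECOND,      @mdate) * 1000
     + DATEPART(MILLISECOND, @mdate) AS hhmmssfff;   -- 101327543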

Connect to SQL Server 2012 running on Azure VM via local SSMS 2012

Posted: 14 Aug 2013 01:06 PM PDT

I have been testing VMs on Azure. I have created a SQL VM running SQL Server 2012 on Windows Server 2012 and would like to connect to it via SSMS 2012 on my local machine instead of connecting via RDP through the Azure Portal.

Thanks!

Retrieve list of matched words in PostgreSQL

Posted: 14 Aug 2013 12:44 PM PDT

I'm new to PostgreSQL and to anything beyond fairly basic queries and inserts.

I've added a tsvector column to the table I'm searching and have set it to index a specific column in that table.

Now what I'm trying to do is find out what words matched in a query.

If I had the query

SELECT ts_headline('the big red ball' , 'big & ball');  

It would obviously return:

"the big red ball"

Is there a way I can get it to return a list instead, so I just see big and ball in separate rows?
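One hedged way to get that: split the query terms yourself and keep the ones the document's tsvector matches, which returns one row per matched word:

SELECT word
FROM   unnest(string_to_array('big ball', ' ')) AS word
WHERE  to_tsvector('english', 'the big red ball') @@ plainto_tsquery('english', word);

Against your table you would replace the literal text with the existing tsvector column and the literal query string with whatever the user searched for.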

Table redirect / filter / trigger on select

Posted: 14 Aug 2013 10:43 AM PDT

Is there any way to redirect queries to different tables / views based on the package that's referencing the table? I.e. packages A and B both have "select grade from schema1.grd_tbl", but I want package A to get the percent grade that's stored in the table, and package B to get a letter grade that's calculated from the percent. I'd like to avoid modifying the (dozens of) packages that reference the table. I'd rather 'spoof' the table somehow if we can, replacing the percent in the grade column with a letter when called from those packages. (The column's varchar2; percents are coded as characters.)

First thought was to create a synonym pointing to a view that massages the column based on the calling package, but the code fully qualifies the table name in most cases so that doesn't seem doable. I went looking for something equivalent to a trigger on select; the closest I've found is fine-grained audit, and it's not row-based.
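If the view route did become acceptable, a rough sketch of a caller-sensitive view follows; all names are hypothetical, and detecting the caller by string-matching the PL/SQL call stack is fragile (it only works when the query is issued from PL/SQL), so treat it as an illustration rather than a recommendation:

CREATE OR REPLACE FUNCTION grade_for_caller (p_pct IN VARCHAR2)
  RETURN VARCHAR2
IS
  v_stack VARCHAR2(4000) := DBMS_UTILITY.FORMAT_CALL_STACK;
BEGIN
  IF INSTR(UPPER(v_stack), 'PACKAGE_B') > 0 THEN
    RETURN CASE
             WHEN TO_NUMBER(p_pct) >= 90 THEN 'A'
             WHEN TO_NUMBER(p_pct) >= 80 THEN 'B'
             WHEN TO_NUMBER(p_pct) >= 70 THEN 'C'
             ELSE 'F'
           END;
  END IF;
  RETURN p_pct;   -- every other caller sees the stored percent
END;
/

CREATE OR REPLACE VIEW grd_tbl_vw AS
SELECT grade_for_caller(grade) AS grade
FROM   schema1.grd_tbl;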

Is there magic somewhere that might let me do this? Any hints appreciated.

Perry.

View with fallback (performance/optimization question)

Posted: 14 Aug 2013 03:44 PM PDT

I have a table with tariffs for stuff; the tariffs table is not important in this scenario, the "tariff values" are. In this Demonstration SQL Fiddle the tariff_plan is the FK to the tariffplans table (not included in the example). The tariff for each "thing" is the tariff_type (simplified to a simple char for demonstration purposes).

I have, for example, a default tariffplan (key = default); this is the tariff that goes for each customer unless another value is defined for the same tariff_type for that customer. A customer is assigned a tariffplan (key = plan_x in my example).

I have tariffs defined for items a, b, c and d in the default plan. In plan_x I define "override" values for a and c.

So, what I do is I select the default plan (alias p below for primary) and left-join the "override" plan (plan_x) to it (alias s below for secondary):

select *
from tariff_values as p
left outer join tariff_values s
    on (p.tariff_type = s.tariff_type) and (s.tariff_plan = 'plan_x')
where (p.tariff_plan = 'default')

This results, as expected, in:

id   tariff_plan tariff_type tariff_foo tariff_bar id   tariff_plan tariff_type tariff_foo tariff_bar
---- ----------- ----------- ---------- ---------- ---- ----------- ----------- ---------- ----------
1    default     a           0.10       0.20       5    plan_x      a           0.09       0.19
2    default     b           0.55       0.66       NULL NULL        NULL        NULL       NULL
3    default     c           1.99       2.99       6    plan_x      c           0.99       1.99
4    default     d           9.99       6.33       NULL NULL        NULL        NULL       NULL

Because I want to abstract this away I want to put this into a table valued function so I can create a "dynamic view":

select * from dbo.get_tariffplan_for('plan_x', default);  

This should result in a "virtual table" (or "dynamic view") similar to the tariff_values table, i.e. not having two tariff_foo's and two tariff_bar's and leaving the application to decide which one to use. And so, I resort to ISNULL and CASE WHEN... constructs to "override" the default values:

select p.tariff_type,
    ISNULL(s.tariff_foo, p.tariff_foo) as tariff_foo,
    ISNULL(s.tariff_bar, p.tariff_bar) as tariff_bar,
    ISNULL(s.tariff_plan, p.tariff_plan) as tariff_plan,
    CASE WHEN s.id IS NULL THEN 1 ELSE 0 END as isfallback
from tariff_values as p
left outer join tariff_values s
    on (p.tariff_type = s.tariff_type) and (s.tariff_plan = 'plan_x')
where (p.tariff_plan = 'default')

This results in:

tariff_type tariff_foo tariff_bar tariff_plan isfallback
----------- ---------- ---------- ----------- -----------
a           0.09       0.19       plan_x      0
b           0.55       0.66       default     1
c           0.99       1.99       plan_x      0
d           9.99       6.33       default     1

All I need to do now is stuff this query into a TVF:

CREATE FUNCTION get_tariffplan_for
(
    @customerplan as varchar(50),
    @defaultplan as varchar(50) = 'default'
)
RETURNS TABLE
AS RETURN
(
    select p.tariff_type,
        ISNULL(s.tariff_foo, p.tariff_foo) as tariff_foo,
        ISNULL(s.tariff_bar, p.tariff_bar) as tariff_bar,
        ISNULL(s.tariff_plan, p.tariff_plan) as tariff_plan,
        CASE WHEN s.id IS NULL THEN 1 ELSE 0 END as isfallback
    from tariff_values as p
    left outer join tariff_values s
        on (p.tariff_type = s.tariff_type) and (s.tariff_plan = @customerplan)
    where (p.tariff_plan = @defaultplan)
);

And there we have it. We can call our function ("dynamic view") as intended (and also use it in selects/joins etc.)

select * from dbo.get_tariffplan_for('plan_x', default);

--or:

select *
from foo
inner join dbo.get_tariffplan_for('plan_x', default) bar
    on foo.tariff_type = bar.tariff_type

Now my first question is:

I have a feeling all these ISNULL (or COALESCE) and/or CASE WHEN ... stunts complicate things unnecessarily, and something tells me this can be done more efficiently. However, I can't come up with a better and/or more efficient alternative.

So I'm hoping someone here has some ideas on how to improve this.

My second question is:

What if I had a product (tariff_type q, for example) that I sold exclusively to some customer? The tariff wouldn't be in the default tariff plan, so I'd have to add another select to the above (with a union) to get all exclusive tariffs for that customer into the result set. That would result in a query like this:

select p.tariff_type,
    ISNULL(s.tariff_foo, p.tariff_foo) as tariff_foo,
    ISNULL(s.tariff_bar, p.tariff_bar) as tariff_bar,
    ISNULL(s.tariff_plan, p.tariff_plan) as tariff_plan,
    CASE WHEN s.id IS NULL THEN 1 ELSE 0 END as isfallback,
    0 as isexclusive
from tariff_values as p
left outer join tariff_values s
    on (p.tariff_type = s.tariff_type) and (s.tariff_plan = @customerplan)
where (p.tariff_plan = @defaultplan)

UNION

--Exclusive values
select p.tariff_type,
    p.tariff_foo,
    p.tariff_bar,
    p.tariff_plan,
    0 as isfallback,
    1 as isexclusive
from tariff_values p
left outer join tariff_values s
    on (p.tariff_type = s.tariff_type) AND (s.tariff_plan = 'default')
where p.tariff_plan = 'plan_x'
    and s.id is null

(Demonstrated in this SQL fiddle)

In the above example I use another left join with s.id is null (though it could be rewritten in other ways: count, exists, not in, having, whatever) to retrieve the customer-exclusive tariffs. But maybe there's a better way than the union?
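For what it's worth, a single-pass alternative that covers overrides, fallbacks and customer-exclusive rows without the UNION is to rank the rows per tariff_type with window functions; this is only a sketch against the same tariff_values table, so compare plans before adopting it:

;WITH ranked AS (
    SELECT tv.*,
           ROW_NUMBER() OVER (PARTITION BY tariff_type
                              ORDER BY CASE WHEN tariff_plan = 'plan_x' THEN 0 ELSE 1 END) AS rn,
           COUNT(*) OVER (PARTITION BY tariff_type) AS plan_count
    FROM   tariff_values tv
    WHERE  tariff_plan IN ('plan_x', 'default')
)
SELECT tariff_type, tariff_foo, tariff_bar, tariff_plan,
       CASE WHEN tariff_plan = 'default' THEN 1 ELSE 0 END AS isfallback,
       CASE WHEN plan_count = 1 AND tariff_plan = 'plan_x' THEN 1 ELSE 0 END AS isexclusive
FROM   ranked
WHERE  rn = 1;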

Distributed transaction and Read Committed Snapshot

Posted: 14 Aug 2013 08:56 AM PDT

Distributed transactions are not supported for snapshot isolation level in SQL Server 2008 R2. But what about read committed snapshot?

How do I determine how much data is being written per day through insert, update and delete operations?

Posted: 14 Aug 2013 09:57 AM PDT

The longevity of SSDs is largely determined by the amount of bytes written by insert, update and delete operations. What is the best way to accurately determine how much data is being written by MariaDB 5.5 on a daily basis so that I can use this to estimate the possible longevity of SSDs if used in a heavy write database environment?

Current setup is that all tables are InnoDB. Can I use Innodb_data_written and Uptime to determine a rough idea of bytes written per day, or is there a better way of doing it?
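As a rough sketch, those two counters give a back-of-the-envelope daily figure (both reset at server restart, so this averages over the current uptime only):

SHOW GLOBAL STATUS LIKE 'Innodb_data_written';  -- bytes written since startup
SHOW GLOBAL STATUS LIKE 'Uptime';               -- seconds since startup
-- bytes per day ~ Innodb_data_written / Uptime * 86400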

Oracle Patch Update

Posted: 14 Aug 2013 11:15 AM PDT

We have an Oracle RAC production environment with a primary and a secondary DB. Our DBA has asked to update the Oracle version from 11.2.0.1.0 (64-bit) to 11.2.0.3 (64-bit) with patches 6880880, 10404530, 16803769 and 16803775.

In our current database we have shared storage, ACL settings, security settings, gateway/heterogeneous connectivity, Data Guard, Data Guard broker, a backup policy, and Oracle Client installed on other machines.

The DBA has estimated that he needs to do the installation and settings from scratch and then test. So, when the version is updated, do we really need to reconfigure and reinstall everything (shared storage, ACL settings, security settings, gateway/heterogeneous connectivity, Data Guard, Data Guard broker, backup policy, and Oracle Client installed on other machines)? If yes, that's fine, but if not, then I need to justify it.

I can understand that testing would be required.

Updateable Subscriptions: Rows do not match between Publisher and Subscriber

Posted: 14 Aug 2013 09:05 PM PDT

I have transactional replication with updatable subscribers set up in SQL Server 2008. It has 3 subscribers and 1 publisher. I had to set up replication again due to some errors related to the database and the application which uses the database. However, now I run into issues when I try updating a section in my application. The update does not go through, and it gives the following error:

Execption Message: Updateable Subscriptions: Rows do not match between Publisher and Subscriber. Run the Distribution Agent to refresh rows at the Subscriber. Updateable Subscriptions: Rolling back transaction. The transaction ended in the trigger. The batch has been aborted. Exception Triggered by Method: OnError

The update statement obviously doesn't go through. However, when I try it a second time it works. Replication is working. Everything seems to be replicating.

Can anyone explain why this error would occur and how I can resolve this issue? I would really appreciate the help!

Replicated Database Log File Maintenance

Posted: 14 Aug 2013 03:05 PM PDT

I have a database on the publisher that is involved in replication (the publication is configured for both merge and transactional). I'm trying to regain control of the log file for this particular database (VLF count, size, etc.).

Is there anything I need to do (or be cautious of) with the replication setup before trying to perform any maintenance on the log file? I am not an expert in the area of replication and cannot find anything solid that provides guidance as to what measures should be taken.

Edit: This would include working on the distribution database as well; data retention was not configured at all for some reason.
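A hedged first diagnostic before touching anything: check what is preventing log truncation and how full the log actually is (the database name below is a placeholder):

SELECT name, log_reuse_wait_desc
FROM   sys.databases
WHERE  name IN ('PublishedDb', 'distribution');

DBCC SQLPERF(LOGSPACE);   -- current log size and percent used per database

If log_reuse_wait_desc reports REPLICATION, the log is being held until the log reader / replication agents have processed it, which is worth resolving before any shrink or VLF work.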

How to avoid empty rows in SSIS Excel Destination?

Posted: 14 Aug 2013 08:05 PM PDT

Does anyone have a way to avoid empty rows when using SSIS to export to Excel? Here's a simple example of one data flow task:

OLE DB Source:

Data Conversion (to handle the annoying UNICODE / NON-UNICODE deal):

[screenshot: Data Conversion]

The end result is either of the two below, depending on the value of "FirstRowHasColumnName" in the Excel Connection Manager. Note the blank rows.

[screenshot: output 1]

[screenshot: output 2]

How to add rows/columns to the table in runtime in SSRS 2008

Posted: 14 Aug 2013 10:05 AM PDT

Usually we design the table to have x number of rows and y number of columns in a report. But how can we create a report which adds the rows and columns dynamically at run time based on the result of the source query?

For example, I want to list studentId, StudentName and any course each student has enrolled in. As the number of courses differs from one person to the other, I should add the rows and related columns for courses at run time based on the query result. How can it be done? For example:

[screenshot: example report layout]

Thanks for your help in advance.

How to disable oracle's MAX_ENABLED_ROLES limit

Posted: 14 Aug 2013 04:05 PM PDT

How do I disable Oracle's MAX_ENABLED_ROLES limit, or increase its value? [Oracle 10g (win32)]
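For reference, and only as a hedged sketch: MAX_ENABLED_ROLES is a static initialization parameter (and is documented as deprecated in 10g, where the effective cap is fixed regardless of the setting), so any change has to go through the spfile plus an instance restart rather than taking effect in memory:

ALTER SYSTEM SET max_enabled_roles = 148 SCOPE = SPFILE;
-- restart the instance afterwards for the static parameter to take effect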

In MySQL, does the order of the columns in a WHERE clause affect query performance, and why?

Posted: 14 Aug 2013 01:05 PM PDT

I have a query that doesn't use any indexes:

SELECT 32,
       guid,
       1,
       1,
       1,
       0,
       5
FROM   test
WHERE  level >= 20
       AND ( ( fun_GetIndexValue(data, 354) >> 16 ) +
             ( fun_GetIndexValue(data, 355) >> 16 ) +
             ( fun_GetIndexValue(data, 356) >> 16 ) +
             ( fun_GetIndexValue(data, 357) >> 16 ) +
             ( fun_GetIndexValue(data, 358) >> 16 ) +
             ( fun_GetIndexValue(data, 359) >> 16 ) ) >= 1;

The level column has only about 80-90 distinct values, the table test has about a million rows, and the data column is passed to the function, so I think the query cannot use any indexes. But I found that if I put the level condition at the end, the query performs more slowly. Why is that?

effective mysql table/index design for 35 million rows+ table, with 200+ corresponding columns (double), any combination of which may be queried

Posted: 14 Aug 2013 07:05 PM PDT

I am looking for advice on table/index design for the following situation:

I have a large table (stock price history data, InnoDB, 35 million rows and growing) with a compound primary key (assetid (int), date (date)). In addition to the pricing information, I have 200 double values that need to correspond to each record.

CREATE TABLE `mytable` (
  `assetid` int(11) NOT NULL,
  `date` date NOT NULL,
  `close` double NOT NULL,
  `f1` double DEFAULT NULL,
  `f2` double DEFAULT NULL,
  `f3` double DEFAULT NULL,
  `f4` double DEFAULT NULL,
  ... skip a few ...
  `f200` double DEFAULT NULL,
  PRIMARY KEY (`assetid`, `date`)
) ENGINE=`InnoDB` DEFAULT CHARACTER SET latin1 COLLATE latin1_swedish_ci
  ROW_FORMAT=COMPACT CHECKSUM=0 DELAY_KEY_WRITE=0
  PARTITION BY RANGE COLUMNS(`date`) PARTITIONS 51;

I initially stored the 200 double columns directly in this table for ease of update and retrieval, and this had been working fine, as the only querying done on this table was by assetid and date (these are religiously included in any query against this table), and the 200 double columns were only read. My database size was around 45 GB.

However, now I have the requirement that I need to be able to query this table by any combination of these 200 columns (named f1, f2, ..., f200), for example:

select * from mytable
where assetid in (1,2,3,4,5,6,7,....)
and date > '2010-1-1' and date < '2013-4-5'
and f1 > -0.23 and f1 < 0.9
and f117 > 0.012 and f117 < .877
etc, etc

I have not historically had to deal with this large an amount of data before, so my first instinct was that indexes were needed on each of these 200 columns, or I would wind up with large table scans, etc. To me this meant I needed a table for each of the 200 columns, with primary key, value, and an index on the values. So I went with that.

CREATE TABLE `f1` (
  `assetid` int(11) NOT NULL DEFAULT '0',
  `date` date NOT NULL DEFAULT '0000-00-00',
  `value` double NOT NULL DEFAULT '0',
  PRIMARY KEY (`assetid`, `date`),
  INDEX `val` (`value`)
) ENGINE=`InnoDB` DEFAULT CHARACTER SET latin1 COLLATE latin1_swedish_ci
  ROW_FORMAT=COMPACT CHECKSUM=0 DELAY_KEY_WRITE=0;

I filled up and indexed all 200 tables. I left the main table intact with all 200 columns, as it is regularly queried by assetid and date range with all 200 columns selected. I figured that leaving those columns in the parent table (unindexed) for read purposes, and additionally having them indexed in their own tables (for join filtering), would be most performant. I ran explains on the new form of the query:

select count(p.assetid) as total
from mytable p
inner join f1 f1 on f1.assetid = p.assetid and f1.date = p.date
inner join f2 f2 on f2.assetid = p.assetid and f2.date = p.date
where p.assetid in (1,2,3,4,5,6,7)
and p.date >= '2011-01-01' and p.date < '2013-03-14'
and (f1.value >= 0.96 and f1.value <= 0.97 and f2.value >= 0.96 and f2.value <= 0.97)

Indeed my desired result was achieved; explain shows that the rows scanned are much smaller for this query. However, I wound up with some undesirable side effects.

1) My database went from 45 GB to 110 GB. I can no longer keep the db in RAM. (I have 256 GB of RAM on the way, however.)

2) Nightly inserts of new data now need to be done 200 times instead of once.

3) Maintenance/defrag of the new 200 tables takes 200 times longer than just the 1 table. It cannot be completed in a night.

4) Queries against the f1, etc. tables are not necessarily performant. For example:

select min(value) from f1
where assetid in (1,2,3,4,5,6,7)
and date >= '2013-3-18' and date < '2013-3-19'

The above query, while explain shows it is looking at < 1000 rows, can take 30+ seconds to complete. I assume this is because the indexes are too large to fit in memory.

Since that was a lot of bad news, I looked further and found partitioning. I implemented partitions on the main table, partitioned on date every 3 months. Monthly seemed to make sense to me, but I have read that once you get over 120 partitions or so, performance suffers. Partitioning quarterly will leave me under that for the next 20 years or so. Each partition is a bit under 2 GB. I ran explain partitions and everything seems to be pruning properly, so regardless, I feel the partitioning was a good step, at the very least for analyze/optimize/repair purposes.

I spent a good deal of time with this article

http://ftp.nchu.edu.tw/MySQL/tech-resources/articles/testing-partitions-large-db.html

My table currently is partitioned with the primary key still on it. The article mentions that primary keys can make a partitioned table slower, but if you have a machine that can handle it, primary keys on the partitioned table will be faster. Knowing I have a big machine on the way (256 GB RAM), I left the keys on.

So as I see it, here are my options:

Option 1

1) Remove the extra 200 tables and let the query do table scans to find the f1, f2, etc. values. Non-unique indexes can actually hurt performance on a properly partitioned table. Run an explain before the user runs the query and deny it if the number of rows scanned is over some threshold I define. Save myself the pain of the giant database. Heck, it will all be in memory soon anyway.

Sub-question:

Does it sound like I have chosen an appropriate partition scheme?

Option 2

Partition all 200 tables using the same 3-month scheme. Enjoy the smaller row scans and allow the users to run larger queries. Now that they are partitioned, at least I can manage them one partition at a time for maintenance purposes. Heck, it will all be in memory soon anyway. Develop an efficient way to update them nightly.

Sub-question:

Do you see a reason that I might avoid primary key indexes on these f1, f2, f3, f4... tables, knowing that I always have assetid and date when querying? It seems counterintuitive to me, but I am not used to data sets of this size. That would shrink the database a bunch, I assume.

Option 3

Drop the f1, f2, f3 columns in the master table to reclaim that space. Do 200 joins if I need to read 200 features; maybe it won't be as slow as it sounds.

Option 4

You all have a better way to structure this than I have thought of so far.

* NOTE: I will soon be adding another 50-100 of these double values to each item, so I need to design knowing that is coming.

Thanks for any and all help.

Update #1 - 3/24/2013

I went with the idea suggested in the comments I got below and created one new table with the following setup:

create table features (
  assetid int,
  date    date,
  feature varchar(4),
  value   double
);

I partitioned the table in 3 month intervals.

I blew away the earlier 200 tables so that my database was back down to 45 GB and started filling up this new table. A day and a half later, it completed, and my database now sits at a chubby 220 GB!

It does allow the possibility of removing these 200 values from the master table, as I can get them from one join, but that would really only give me back 25 GB or so, maybe.

I asked it to create a primary key on assetid, date, feature and an index on value, and after 9 hours of chugging it really hadn't made a dent and seemed to freeze up, so I killed that part off.

I rebuilt a couple of the partitions, but it did not seem to reclaim much/any space.

So that solution looks like it probably isn't going to be ideal. Do rows take up significantly more space than columns, I wonder? Could that be why this solution took up so much more space?

I came across this article

http://www.chrismoos.com/2010/01/31/mysql-partitioning-tables-with-millions-of-rows

It gave me an idea.

He says:

"At first, I thought about RANGE partitioning by date, and while I am using the date in my queries, it is very common for a query to have a very large date range, and that means it could easily span all partitions."

Now I am range partitioning by date as well, but I will also be allowing searches over a large date range, which will decrease the effectiveness of my partitioning. I will always have a date range when I search; however, I will also always have a list of assetids. Perhaps my solution should be to partition by assetid and date, where I identify typically searched assetid ranges (which I can come up with; there are standard lists, S&P 500, Russell 2000, etc.). This way I would almost never look at the entire data set.
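As a sketch of that idea only (the assetid boundaries below are made up; real ones would come from the standard lists), MySQL can range-partition on both columns at once with RANGE COLUMNS:

ALTER TABLE mytable
PARTITION BY RANGE COLUMNS (assetid, date) (
    PARTITION p_grp1_2011 VALUES LESS THAN (1000, '2012-01-01'),
    PARTITION p_grp1_2012 VALUES LESS THAN (1000, '2013-01-01'),
    PARTITION p_grp2_2011 VALUES LESS THAN (2000, '2012-01-01'),
    PARTITION p_grp2_2012 VALUES LESS THAN (2000, '2013-01-01'),
    PARTITION p_max       VALUES LESS THAN (MAXVALUE, MAXVALUE)
);

Whether pruning actually kicks in for IN lists of assetids plus a date range is something to verify with EXPLAIN PARTITIONS before committing to the rebuild.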

Then again, I am primary keyed on assetid and date anyway, so maybe that wouldn't help much.

Any more thoughts/comments would be appreciated.

Thanks.

Delete a word, its meanings, and its meanings' example sentences from the DB

Posted: 14 Aug 2013 05:05 PM PDT

I have three tables as below (simplified for demonstration):

words
=====
integer id
text    word

meanings
========
integer id
integer word_id
text    meaning

examples
========
integer id
integer meaning_id
text    sentence

where word_id stores the id of the word in the words table and meaning_id stores the id of the meaning in the meanings table. I am trying to figure out a SQL query that, given a word's id, deletes the word along with all its meanings and example sentences all at one time. Is such a SQL query possible to compose? If so, how?

Edit1: I am using SQLite3 as the database.

Edit2: I figured out the following solution, which requires 3 SQL queries run in order:

DELETE FROM examples WHERE meaning_id IN (SELECT id FROM meanings WHERE word_id = the_given_id);
DELETE FROM meanings WHERE word_id = the_given_id;
DELETE FROM words WHERE id = the_given_id;

I'm still looking for the answer to my question: can the whole process be done in one query?
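For SQLite specifically, one way to make a single statement do all three deletes is to declare the foreign keys with ON DELETE CASCADE (enforced only while the foreign_keys pragma is on); this is a sketch that assumes the schema can be recreated:

PRAGMA foreign_keys = ON;

CREATE TABLE meanings (
  id      INTEGER PRIMARY KEY,
  word_id INTEGER REFERENCES words(id) ON DELETE CASCADE,
  meaning TEXT
);

CREATE TABLE examples (
  id         INTEGER PRIMARY KEY,
  meaning_id INTEGER REFERENCES meanings(id) ON DELETE CASCADE,
  sentence   TEXT
);

-- With the cascades in place, one statement removes the word, its meanings,
-- and their example sentences:
DELETE FROM words WHERE id = :the_given_id;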

How can I optimize this query and support multiple SKUs?

Posted: 14 Aug 2013 12:05 PM PDT

My current query can only select one SKU at a time. I can leave salesite_id constant; if there is a way to also have varying salesite_ids, that would be good too, but it's not necessary. Also, any suggestions on indexes would be much appreciated.

SELECT
    available - (
        SELECT COALESCE(sum(quantity), 0)
        FROM product_locks
        WHERE sku = 'sku1'
    ) - (
        SELECT COALESCE(sum(quantity), 0)
        FROM
            orderlineitems
            INNER JOIN responses_authnets ON responses_authnets.id = orderlineitems.response_id
        WHERE
            sku = 'sku1' AND responses_authnets.salesite_id = 'site_id_1'
    ) AS free,
    available AS total,
    sku,
    on_hold
FROM product_inventories
WHERE sku = 'sku1' AND salesite_id = 'site_id_1';
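One hedged way to support several SKUs in a single statement (the same tables and columns as above are assumed) is to turn each correlated subquery into a grouped derived table and join on sku:

SELECT i.sku,
       i.available AS total,
       i.on_hold,
       i.available
         - COALESCE(l.locked, 0)
         - COALESCE(o.ordered, 0) AS free
FROM product_inventories i
LEFT JOIN (SELECT sku, SUM(quantity) AS locked
           FROM product_locks
           GROUP BY sku) l ON l.sku = i.sku
LEFT JOIN (SELECT oli.sku, SUM(oli.quantity) AS ordered
           FROM orderlineitems oli
           INNER JOIN responses_authnets r ON r.id = oli.response_id
           WHERE r.salesite_id = 'site_id_1'
           GROUP BY oli.sku) o ON o.sku = i.sku
WHERE i.sku IN ('sku1', 'sku2')
  AND i.salesite_id = 'site_id_1';

Plausible supporting indexes would be product_locks (sku), orderlineitems (response_id, sku), and product_inventories (salesite_id, sku), but verify against the actual plans.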

How to modify an update in Oracle so it performs faster?

Posted: 14 Aug 2013 02:05 PM PDT

I have this query:

UPDATE
(
    SELECT h.valid_through_dt, h.LAST_UPDATE_TMSTMP
    FROM ETL_FEE_SCH_TMP d, FEE_SCHEDULE_HISTORICAL h
    WHERE h.FUND_ID = d.FUND_ID
    AND h.FEETYPE_NAME = d.FEETYPE_NAME
    AND h.BREAKPOINT_TYPE = d.BREAKPOINT_TYPE
    AND h.BREAKPOINT_QTY = d.BREAKPOINT_QTY
    AND h.LOW_BREAKPOINT_AMT = d.LOW_BREAKPOINT_AMT
    AND h.VALID_THROUGH = TO_DATE ('31-DEC-9999', 'dd-mon-yyyy')
    AND h.universe = 'DC'
    AND h.universe = d.universe
    AND EXISTS
    (
        SELECT 1
        FROM FEE_SCHEDULE s
        WHERE s.FUND_ID = h.FUND_ID
        AND s.FEETYPE_NAME = h.FEETYPE_NAME
        AND s.BREAKPOINT_TYPE = h.BREAKPOINT_TYPE
        AND s.BREAKPOINT_QTY = h.BREAKPOINT_QTY
        AND s.LOW_BREAKPOINT_AMT = h.LOW_BREAKPOINT_AMT
        AND s.universe = 'DC'
    )
) updateTable
SET
    updateTable.VALID_THROUGH = (SYSDATE - 1),
    updateTable.LAST_UPDATE_TMSTMP = SYSTIMESTAMP;

The trouble that I am having is that this query takes a long time to run. I don't know whether it is possible to run it in parallel, or whether it would be easier to update via a cursor in a pipelined function.

What would you suggest?
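On the parallel question specifically, a minimal sketch of running the same UPDATE with parallel DML (Enterprise Edition only; the degree of 8 is an assumption, size it to the hardware):

ALTER SESSION ENABLE PARALLEL DML;
ALTER SESSION FORCE PARALLEL DML PARALLEL 8;
ALTER SESSION FORCE PARALLEL QUERY PARALLEL 8;

-- ...run the UPDATE from above unchanged, then:
COMMIT;

ALTER SESSION DISABLE PARALLEL DML;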

This is all the information that I believe is relevant.

This is the execution plan of the internal select:

Execution Plan
----------------------------------------------------------
Plan hash value: 57376096
---------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name                     | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                          |     1 |   306 |  8427   (1)| 00:01:42 |
|   1 |  NESTED LOOPS                |                          |       |       |            |          |
|   2 |   NESTED LOOPS               |                          |     1 |   306 |  8427   (1)| 00:01:42 |
|   3 |    MERGE JOIN CARTESIAN      |                          |     1 |   192 |  8426   (1)| 00:01:42 |
|*  4 |     INDEX RANGE SCAN         | SYS_C000666              |     1 |    96 |     2   (0)| 00:00:01 |
|   5 |     BUFFER SORT              |                          |  3045K|   278M|  8425   (1)| 00:01:42 |
|   6 |      SORT UNIQUE             |                          |  3045K|   278M|  8425   (1)| 00:01:42 |
|*  7 |       TABLE ACCESS FULL      | FEE_SCHEDULE             |  3045K|   278M|  8425   (1)| 00:01:42 |
|*  8 |    INDEX RANGE SCAN          | FEE_SCHDL_IDX1           |     1 |       |     1   (0)| 00:00:01 |
|*  9 |   TABLE ACCESS BY INDEX ROWID| FEE_SCHEDULE_HISTORICAL  |     1 |   114 |     1   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("D"."UNIVERSE"='DC')
   7 - filter("S"."UNIVERSE"='DC')
   8 - access("H"."UNIVERSE"='DC' AND "S"."FUND_ID"="H"."FUND_ID" AND
              "S"."FEETYPE_NAME"="H"."FEETYPE_NAME" AND
              "S"."BREAKPOINT_TYPE"="H"."BREAKPOINT_TYPE" AND
              "S"."BREAKPOINT_QTY"="H"."BREAKPOINT_QTY" AND
              "S"."LOW_BREAKPOINT_AMT"="H"."LOW_BREAKPOINT_AMT")
       filter("H"."FUND_ID"="D"."FUND_ID" AND
              "H"."FEETYPE_NAME"="D"."FEETYPE_NAME" AND
              "H"."BREAKPOINT_TYPE"="D"."BREAKPOINT_UNIT_TY

Table data:

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
UNIVERSE|FUND_ID   |FEETYPE_NAME |BREAKPOINT_TYPE|BREAKPOINT_QTY|LOW_BREAKPOINT_AMT|HIGH_BREAKPOINT_AMT|FEE_PCT|FEE_SCHDL_SEQ_ID|GROUP_ID|LAST_UPDATE_TMSTMP  |VALID_FROM|VALID_THROUGH|INSERT_TMSTMP        |JOB_ID|
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
DC      |DC9ZTPLPHO|DeferLoad    |Percentage     |4             |10000             |300000             |3.14   |780250          |null    |1/4/2012  3:59:54 PM|6/23/2012 |12/31/9999   |1/5/2011   3:59:54 PM|666   |
DC      |DCE86Y8XFU|RedemptionFee|Percentage     |9             |  100             |100500             |7.67   |780251          |null    |6/4/2012  4:49:54 PM|11/12/2011|12/31/9999   |8/17/2011  2:00:54 PM|666   |
DC      |DCAYL0KONA|FrontLoad    |Percentage     |2             |50000             |601500             |5.00   |780252          |null    |4/25/2012 4:49:54 PM|8/2/2012  |12/31/9999   |12/19/2012 9:59:00 PM|666   |
DC      |DC9ZTPLPHO|DeferLoad    |Percentage     |7             |80000             |900000             |2.24   |780252          |null    |4/25/2012 4:49:54 PM|8/2/2012  |12/31/9999   |12/19/2012 9:59:00 PM|666   |
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

This is the script of the historical table:

CREATE TABLE FEE_SCHEDULE_HISTORICAL
(
  UNIVERSE                        VARCHAR2(2 BYTE) NOT NULL,
  FUND_ID                         VARCHAR2(10 BYTE) NOT NULL,
  FEETYPE_NAME                    VARCHAR2(75 BYTE),
  BREAKPOINT_TYPE                 VARCHAR2(50 BYTE),
  BREAKPOINT_QTY                  VARCHAR2(10 BYTE),
  LOW_BREAKPOINT_AMT              NUMBER(19,6),
  HIGH_BREAKPOINT_AMT             NUMBER(19,6),
  FEE_PCT                         NUMBER(19,6),
  FEE_SCHDL_SEQ_ID                NUMBER        NOT NULL,
  GROUP_ID                        NUMBER,
  LAST_UPDATE_TMSTMP              DATE          NOT NULL,
  VALID_FROM                      DATE          NOT NULL,
  VALID_THROUGH                   DATE          NOT NULL,
  INSERT_TMSTMP                   DATE          NOT NULL,
  JOB_ID                          NUMBER        NOT NULL
);

CREATE UNIQUE INDEX FEE_SCHDL_PK ON FEE_SCHEDULE_HISTORICAL(FEE_SCHDL_SEQ_ID);

CREATE UNIQUE INDEX FEE_SCHDL_HST_IDX ON FEE_SCHEDULE_HISTORICAL (
    UNIVERSE,
    FUND_ID,
    FEETYPE_NAME,
    BREAKPOINT_TYPE,
    BREAKPOINT_QTY,
    LOW_BREAKPOINT_AMT,
    VALID_FROM,
    JOB_ID
);

CREATE INDEX FEE_SCHEDULE_HST_IDX2 ON FEE_SCHEDULE_HISTORICAL(LAST_UPDATE_TMSTMP);

CREATE INDEX FEE_SCHEDULE_HST_IDX3 ON FEE_SCHEDULE_HISTORICAL(VALID_THROUGH);

ALTER TABLE FEE_SCHEDULE_HISTORICAL ADD (
    CONSTRAINT FEE_SCHDL_PK
    PRIMARY KEY
    (FEE_SCHDL_SEQ_ID)
);

This is the other table:

CREATE TABLE FEE_SCHEDULE
(
  UNIVERSE                        VARCHAR2(2 BYTE) NOT NULL,
  FUND_ID                         VARCHAR2(10 BYTE) NOT NULL,
  FEETYPE_NAME                    VARCHAR2(75 BYTE),
  BREAKPOINT_TYPE                 VARCHAR2(50 BYTE),
  BREAKPOINT_QTY                  VARCHAR2(10 BYTE),
  LOW_BREAKPOINT_AMT              NUMBER(19,6),
  HIGH_BREAKPOINT_AMT             NUMBER(19,6),
  FEE_PCT                         NUMBER(19,6),
  JOB_RUN_ID                      NUMBER        NOT NULL,
  FILE_DATE                       DATE          NOT NULL,
  CYCLE_DATE                      DATE          NOT NULL
)

The temporary table is the result of FEE_SCHEDULE_HISTORICAL minus FEE_SCHEDULE

Query to find and replace text in all tables and fields of a mysql db

Posted: 14 Aug 2013 06:05 PM PDT

I need to run a query to find and replace some text in all tables of a mysql database.

I found this query, but it only looks for the text in the tbl_name table and just in the column field.

update tbl_name set column=REPLACE(column, 'fuschia', 'fuchsia');   

I need it to look in all tables and all fields (everywhere in the database).
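A hedged sketch of one common approach: generate one UPDATE per text column from information_schema, review the output, then run it ('mydb' is a placeholder for the schema name):

SELECT CONCAT('UPDATE `', table_schema, '`.`', table_name,
              '` SET `', column_name, '` = REPLACE(`', column_name,
              '`, ''fuschia'', ''fuchsia'');') AS stmt
FROM information_schema.columns
WHERE table_schema = 'mydb'
  AND data_type IN ('char', 'varchar', 'text', 'mediumtext', 'longtext');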
