Saturday, June 1, 2013

How to avoid large joins when determining which rows a user is authorized for?

Posted: 01 Jun 2013 09:08 PM PDT

Typical web app situation: you have logged-in users querying for resources exposed through an API. As a hypothetical example, company administrators querying for their employees' time sheets:

company <- employees <- time_sheets  company <- admin  

Say I want to get all the time sheets that I'm authorized for (i.e., have access to) as an admin of that company.

Without any optimizations, in the above I'd have to join admin with company with employees with time_sheets to determine all the time_sheets I can look at. You can see how for a wider schema the joins could add up very fast. I realize that in modern databases joins are ludicrously fast, but I'm guessing this is still a case you don't want to completely ignore.

Is there a good solution to this? One could potentially have a separate admin-to-time_sheets many-to-many join table just to speed up those queries (a sketch of this idea follows below), but then maintaining it might be tricky.

Any suggestions?
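For illustration, here is a minimal sketch of that join-table idea, with hypothetical table and column names; the table would have to be maintained by triggers or application code whenever employees or time sheets change:

    -- hypothetical materialized permission table
    CREATE TABLE admin_time_sheets (
        admin_id      INT NOT NULL,
        time_sheet_id INT NOT NULL,
        PRIMARY KEY (admin_id, time_sheet_id)
    );

    -- the authorization check then collapses to a single join
    SELECT ts.*
    FROM time_sheets ts
    JOIN admin_time_sheets ats ON ats.time_sheet_id = ts.id
    WHERE ats.admin_id = 42;  -- hypothetical admin id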

Splitting SQL query with many joins into smaller ones helps?

Posted: 01 Jun 2013 05:23 PM PDT

We need to do some reporting every night on our SQL Server 2008 R2. Calculating the reports takes several hours, so in order to shorten the time we precalculate a table. This table is created by joining 12 quite big tables (tens of millions of rows each).

Until a few days ago, calculating this aggregation table took about 4 hours. Our DBA then split this big join into 3 smaller joins (each joining 4 tables). Each intermediate result is saved into a temporary table, which is used in the next join.

The result of the DBA's enhancement is that the aggregation table is now calculated in 15 minutes. I wondered how that is possible. The DBA told me that it is because the amount of data the server must process is smaller; in other words, that in the big original join the server has to work with more data than in the smaller joins combined. However, I would presume that the optimizer would take care of doing this efficiently with the original big join, splitting the joins on its own and passing only the columns needed to the next joins.

The other thing he did was create an index on one of the temporary tables. However, once again I would think that the optimizer would create the appropriate hash tables if needed and altogether better optimize the computation.

I talked about this with our DBA, but he was himself uncertain about what caused the improvement in processing time. He just mentioned that he would not blame the server, as it can be overwhelming to compute such big data, and that it is possible that the optimizer has a hard time predicting the best execution plan. This I understand, but I would like a more definitive answer as to exactly why.

So, the questions are:

1. "What could possibly cause the big improvement?"

2. "Is it a standard procedure to split big joins into smaller ones?"

3. "Is the amount of data which the server has to process really smaller in the case of multiple smaller joins?"

Here is the original query:

    Insert Into FinalResult_Base
    SELECT
        TC.TestCampaignContainerId,
        TC.CategoryId As TestCampaignCategoryId,
        TC.Grade,
        TC.TestCampaignId,
        T.TestSetId
        ,TL.TestId
        ,TSK.CategoryId
        ,TT.[TestletId]
        ,TL.SectionNo
        ,TL.Difficulty
        ,TestletName = Char(65+TL.SectionNo) + CONVERT(varchar(4),6 - TL.Difficulty)
        ,TQ.[QuestionId]
        ,TS.StudentId
        ,TS.ClassId
        ,RA.SubjectId
        ,TQ.[QuestionPoints]
        ,GoodAnswer  = Case When TQ.[QuestionPoints] Is null Then 0
                            When TQ.[QuestionPoints] > 0 Then 1
                            Else 0 End
        ,WrongAnswer = Case When TQ.[QuestionPoints] = 0 Then 1
                            When TQ.[QuestionPoints] Is null Then 1
                            Else 0 End
        ,NoAnswer    = Case When TQ.[QuestionPoints] Is null Then 1 Else 0 End
        ,TS.Redizo
        ,TT.ViewCount
        ,TT.SpentTime
        ,TQ.[Position]
        ,RA.SpecialNeeds
        ,[Version] = 1
        ,TestAdaptationId = TA.Id
        ,TaskId = TSK.TaskId
        ,TaskPosition = TT.Position
        ,QuestionRate = Q.Rate
        ,TestQuestionId = TQ.Guid
        ,AnswerType = TT.TestletAnswerTypeId
    FROM
        [TestQuestion] TQ WITH (NOLOCK)
        Join [TestTask] TT WITH (NOLOCK)        On TT.Guid = TQ.TestTaskId
        Join [Question] Q WITH (NOLOCK)         On TQ.QuestionId = Q.QuestionId
        Join [Testlet] TL WITH (NOLOCK)         On TT.TestletId = TL.Guid
        Join [Test]     T WITH (NOLOCK)         On TL.TestId = T.Guid
        Join [TestSet] TS WITH (NOLOCK)         On T.TestSetId = TS.Guid
        Join [RoleAssignment] RA WITH (NOLOCK)  On TS.StudentId = RA.PersonId And RA.RoleId = 1
        Join [Task] TSK WITH (NOLOCK)           On TSK.TaskId = TT.TaskId
        Join [Category] C WITH (NOLOCK)         On C.CategoryId = TSK.CategoryId
        Join [TimeWindow] TW WITH (NOLOCK)      On TW.Id = TS.TimeWindowId
        Join [TestAdaptation] TA WITH (NOLOCK)  On TA.Id = TW.TestAdaptationId
        Join [TestCampaign] TC WITH (NOLOCK)    On TC.TestCampaignId = TA.TestCampaignId
    WHERE
        T.TestTypeId = 1              -- exclude surveys
        And t.ProcessedOn is not null -- not all, only completed ones
        And TL.ShownOn is not null
        And TS.Redizo not in (999999999, 111111119)

The new split joins after the DBA's great work:

    SELECT
        TC.TestCampaignContainerId,
        TC.CategoryId As TestCampaignCategoryId,
        TC.Grade,
        TC.TestCampaignId,
        T.TestSetId
        ,TL.TestId
        ,TL.SectionNo
        ,TL.Difficulty
        ,TestletName = Char(65+TL.SectionNo) + CONVERT(varchar(4),6 - TL.Difficulty) -- conversion to A5, B4, B5 ...
        ,TS.StudentId
        ,TS.ClassId
        ,TS.Redizo
        ,[Version] = 1 -- ?
        ,TestAdaptationId = TA.Id
        ,TL.Guid AS TLGuid
        ,TS.TimeWindowId
    INTO
        [#FinalResult_Base_1]
    FROM
        [TestSet] [TS] WITH (NOLOCK)
        JOIN [Test] [T] WITH (NOLOCK)
            ON [T].[TestSetId] = [TS].[Guid] AND [TS].[Redizo] NOT IN (999999999, 111111119) AND [T].[TestTypeId] = 1 AND [T].[ProcessedOn] IS NOT NULL
        JOIN [Testlet] [TL] WITH (NOLOCK)
            ON [TL].[TestId] = [T].[Guid] AND [TL].[ShownOn] IS NOT NULL
        JOIN [TimeWindow] [TW] WITH (NOLOCK)
            ON [TW].[Id] = [TS].[TimeWindowId] AND [TW].[IsActive] = 1
        JOIN [TestAdaptation] [TA] WITH (NOLOCK)
            ON [TA].[Id] = [TW].[TestAdaptationId] AND [TA].[IsActive] = 1
        JOIN [TestCampaign] [TC] WITH (NOLOCK)
            ON [TC].[TestCampaignId] = [TA].[TestCampaignId] AND [TC].[IsActive] = 1
        JOIN [TestCampaignContainer] [TCC] WITH (NOLOCK)
            ON [TCC].[TestCampaignContainerId] = [TC].[TestCampaignContainerId] AND [TCC].[IsActive] = 1
    ;

    SELECT
        FR1.TestCampaignContainerId,
        FR1.TestCampaignCategoryId,
        FR1.Grade,
        FR1.TestCampaignId,
        FR1.TestSetId
        ,FR1.TestId
        ,TSK.CategoryId AS [TaskCategoryId]
        ,TT.[TestletId]
        ,FR1.SectionNo
        ,FR1.Difficulty
        ,TestletName = Char(65+FR1.SectionNo) + CONVERT(varchar(4),6 - FR1.Difficulty) -- conversion to A5, B4, B5 ...
        ,FR1.StudentId
        ,FR1.ClassId
        ,FR1.Redizo
        ,TT.ViewCount
        ,TT.SpentTime
        ,[Version] = 1 -- ?
        ,FR1.TestAdaptationId
        ,TaskId = TSK.TaskId
        ,TaskPosition = TT.Position
        ,AnswerType = TT.TestletAnswerTypeId
        ,TT.Guid AS TTGuid
    INTO
        [#FinalResult_Base_2]
    FROM
        #FinalResult_Base_1 FR1
        JOIN [TestTask] [TT] WITH (NOLOCK)
            ON [TT].[TestletId] = [FR1].[TLGuid]
        JOIN [Task] [TSK] WITH (NOLOCK)
            ON [TSK].[TaskId] = [TT].[TaskId] AND [TSK].[IsActive] = 1
        JOIN [Category] [C] WITH (NOLOCK)
            ON [C].[CategoryId] = [TSK].[CategoryId] AND [C].[IsActive] = 1
    ;

    DROP TABLE [#FinalResult_Base_1]

    CREATE NONCLUSTERED INDEX [#IX_FR_Student_Class]
    ON [dbo].[#FinalResult_Base_2] ([StudentId],[ClassId])
    INCLUDE ([TTGuid])

    SELECT
        FR2.TestCampaignContainerId,
        FR2.TestCampaignCategoryId,
        FR2.Grade,
        FR2.TestCampaignId,
        FR2.TestSetId
        ,FR2.TestId
        ,FR2.[TaskCategoryId]
        ,FR2.[TestletId]
        ,FR2.SectionNo
        ,FR2.Difficulty
        ,FR2.TestletName
        ,TQ.[QuestionId]
        ,FR2.StudentId
        ,FR2.ClassId
        ,RA.SubjectId
        ,TQ.[QuestionPoints] -- 1+ good, 0 wrong, null no answer
        ,GoodAnswer  = Case When TQ.[QuestionPoints] Is null Then 0
                            When TQ.[QuestionPoints] > 0 Then 1
                            Else 0 End
        ,WrongAnswer = Case When TQ.[QuestionPoints] = 0 Then 1
                            When TQ.[QuestionPoints] Is null Then 1
                            Else 0 End
        ,NoAnswer    = Case When TQ.[QuestionPoints] Is null Then 1 Else 0 End
        ,FR2.Redizo
        ,FR2.ViewCount
        ,FR2.SpentTime
        ,TQ.[Position] AS [QuestionPosition]
        ,RA.SpecialNeeds -- identifies special educational needs (SVP)
        ,[Version] = 1 -- ?
        ,FR2.TestAdaptationId
        ,FR2.TaskId
        ,FR2.TaskPosition
        ,QuestionRate = Q.Rate
        ,TestQuestionId = TQ.Guid
        ,FR2.AnswerType
    INTO
        [#FinalResult_Base]
    FROM
        [#FinalResult_Base_2] FR2
        JOIN [TestQuestion] [TQ] WITH (NOLOCK)
            ON [TQ].[TestTaskId] = [FR2].[TTGuid]
        JOIN [Question] [Q] WITH (NOLOCK)
            ON [Q].[QuestionId] = [TQ].[QuestionId] AND [Q].[IsActive] = 1
        JOIN [RoleAssignment] [RA] WITH (NOLOCK)
            ON [RA].[PersonId] = [FR2].[StudentId]
            AND [RA].[ClassId] = [FR2].[ClassId] AND [RA].[IsActive] = 1 AND [RA].[RoleId] = 1

    drop table #FinalResult_Base_2;

    truncate table [dbo].[FinalResult_Base];
    insert into [dbo].[FinalResult_Base] select * from #FinalResult_Base;

    drop table #FinalResult_Base;

Retrieving 10 records from multiple tables

Posted: 01 Jun 2013 03:38 AM PDT

I have two tables where I store post data. What I want to do is to get 10 records from those two tables. Table 1 holds all posts; table 2 holds the posts the user has read. There are some scenarios:

  • The user has never read a post. In that case, all the records come from table 1. This is easy: select * from table1;
  • The user has read some posts, but we don't know how many. In this case, at most 3 posts will be fetched from table 2, and the rest of the records will come from table 1.

    We could do this by counting how many posts there are in table 2; if there are more than 3 records, take only 3 of them (if fewer, take as many as there are), do the math to work out how many records must be fetched from table 1, and then run a second query against table 1 to get the remaining 7 (or more). But in this case we combine PHP with MySQL and do more work.

How can I get this done in one query?
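One possible single-query sketch, assuming both tables share hypothetical id and title columns: take up to 3 read posts, fill the remaining slots from table 1, and cap the total at 10 (read posts are sorted first, so the inner LIMIT 3 is what actually bounds them):

    (select id, title, 1 as is_read from table2 order by id desc limit 3)
    union all
    (select id, title, 0 as is_read from table1
     where id not in (select id from table2)
     order by id desc limit 10)
    order by is_read desc, id desc
    limit 10;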

Wrong in select statement [closed]

Posted: 01 Jun 2013 02:07 AM PDT

When I test this with SQL Navigator, it works fine:

select userid,* from user_ ;  

But with SQL Developer, I get an error: ORA-00936: missing expression

Why is that?
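If the goal is to select the full row plus one column again, a likely fix is that Oracle requires a table alias when * is combined with other select-list items, so a sketch like this should work in both tools:

    select u.userid, u.* from user_ u;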

MySQL Table Grant All users

Posted: 01 Jun 2013 04:50 AM PDT

Is it possible to assign a grant on a table to all users, or to set a default permission so that newly created users automatically get the specific grants (SELECT, UPDATE, INSERT, DELETE) for that table?
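As far as I know, MySQL (which has no roles at this point) has no default privilege that applies to users created later, so the grant has to be issued per user; a sketch with hypothetical names:

    GRANT SELECT, INSERT, UPDATE, DELETE ON mydb.mytable TO 'newuser'@'%';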

Postgres connection access denied on IPv6 address

Posted: 01 Jun 2013 08:55 PM PDT

Installed PostgreSQL 9.1 x64 on Windows, set up a listen address, but when connecting with pgAdmin I get the following error. Not sure why PostgreSQL is seeing my IPv6 address and not my regular IP address:

[screenshot: pgAdmin connection error message]

To get authentication working, based on the error message, I updated pg_hba.conf with this:

host all all fe80::c5d2:XXXX:XXXX:3bc0/12 trust

That worked, but it's ugly and too specific. I tried the following, based on the PostgreSQL docs, but neither worked; I get the same 'access denied' error:

local all all trust  host all all 0.0.0.0/12 trust  

I got this one working, which covers the entire IPv6 address space, but how can I specify an IPv6 range for more restriction?

host  mydb  myuser  ::/0   trust  

Questions

  • Why does pgAdmin pick up my IPv6 address and not my normal IP?
  • How do I specify a range in IPv6 without resorting to ::/0?
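For what it's worth, IPv6 ranges in pg_hba.conf use the same CIDR notation as IPv4; a sketch (the /64 below is a hypothetical site prefix):

    # the whole link-local block
    host    all     all     fe80::/10                  trust
    # a specific /64 subnet, with password authentication
    host    mydb    myuser  2001:db8:85a3:8d3::/64     md5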

Is a geographically distributed application with SQL Server replication a good idea?

Posted: 01 Jun 2013 08:59 PM PDT

We are currently thinking about a target architecture for a geographically distributed application.

This application tracks and schedules the movements of all material units across various production sites.

There will be two logical entities:

  • The first one is responsible for the operational/local management of a specific store - there are as many instances of this entity as there are stores
  • The second one is responsible for the global management (for example: unit transfers between stores) and therefore communicates with all operational/local entities

Here is the logical diagram of the application:

Logical architecture

The implementation of this application cannot be directly derived from the previous diagram because we have to take into account the following constraints:

  • Stores are located in different areas (typically a few stores per area) and if the area is temporarily isolated from the global network, the system should still be functional for a limited amount of time.
  • The operational entity requires the global entity to work

Therefore, I thought about a new architecture where we would create identical instances of the global entity (same executables) with synchronized databases for each area:

[diagram: proposed architecture with replicated global entities per area]

I was wondering if anyone has already used SQL Server replication to achieve this kind of goal.

Is it the right way to proceed? All recommendations/warnings are welcome!

Thanks a lot, Sébastien.

Reduce Clustered Index seek cost SQL Server 2008 R2

Posted: 01 Jun 2013 05:24 PM PDT

I am running a query against a table with records in the lakhs (hundreds of thousands), and it takes more than 20 minutes to fetch the data. Looking at the execution plan, I noticed that the clustered index seek cost may be the reason. How can I reduce the clustered index seek cost of the query below?

The foreign keys cost around 13% to 23%.

    CREATE PROC [dbo].[Test] (@UserTypeID  INT,
                              @UserID      INT,
                              @CityID      INT,
                              @OperatorID  INT,
                              @ParameterID INT)
    AS
    BEGIN
        DECLARE @temp TABLE (
            range  DECIMAL(18, 2),
            range2 DECIMAL(18, 2),
            image  VARCHAR(50),
            symbol VARCHAR(20))

        IF( @UserID > 0 )
          BEGIN
              --print 'hii'
              INSERT INTO @temp
                          (range, range2, image, symbol)
              SELECT tbl_Legend_ViewNetwork_Dtls.range,
                     tbl_Legend_ViewNetwork_Dtls.range2,
                     tbl_Legend_ViewNetwork_Dtls.image,
                     tbl_Legend_ViewNetwork_Dtls.symbol
              FROM   tbl_Legend_ViewNetwork_Dtls
                     INNER JOIN tbl_Legend_ViewNetwork
                       ON tbl_Legend_ViewNetwork_Dtls.tbl_legend_view_network_id = tbl_Legend_ViewNetwork.id
              WHERE  tbl_Legend_ViewNetwork.parameter_id = @ParameterID
                     AND tbl_Legend_ViewNetwork.user_type_id = @UserTypeID
                     AND tbl_Legend_ViewNetwork.is_default = 1
                     AND tbl_Legend_ViewNetwork.user_id = @UserID

              UPDATE @temp
              SET    range = range2,
                     range2 = range
              WHERE  symbol = '<'
          END
        ELSE
          BEGIN
              INSERT INTO @temp
                          (range, range2, image, symbol)
              SELECT tbl_Legend_ViewNetwork_Dtls.range,
                     tbl_Legend_ViewNetwork_Dtls.range2,
                     tbl_Legend_ViewNetwork_Dtls.image,
                     tbl_Legend_ViewNetwork_Dtls.symbol
              FROM   tbl_Legend_ViewNetwork_Dtls
                     INNER JOIN tbl_Legend_ViewNetwork
                       ON tbl_Legend_ViewNetwork_Dtls.tbl_legend_view_network_id = tbl_Legend_ViewNetwork.id
              WHERE  tbl_Legend_ViewNetwork.parameter_id = @ParameterID
                     AND tbl_Legend_ViewNetwork.user_type_id = @UserTypeID
                     AND tbl_Legend_ViewNetwork.is_default = 1

              UPDATE @temp
              SET    range = range2,
                     range2 = range
              WHERE  symbol = '<'
          END

        --select * from @temp

        SELECT '[' + STUFF((SELECT ',{"latitude":"' + a.lat + '","longitude":"' + a.long
                                   + '","value":"' + CONVERT(VARCHAR(20), a.value)
                                   + '","image":"' + temp.image + '"}'
                            FROM   (SELECT tbl_Survey_Details.lat,
                                           tbl_Survey_Details.long,
                                           tbl_Survey_Details.value
                                    FROM   tbl_Survey_Details
                                           INNER JOIN tbl_Survey
                                             ON tbl_Survey_Details.tbl_survey_id = tbl_Survey.id
                                           INNER JOIN tbl_Location
                                             ON tbl_Survey.tbl_location_id = tbl_Location.id
                                           INNER JOIN tbl_Area
                                             ON tbl_Location.tbl_area_id = tbl_Area.id
                                           INNER JOIN tbl_City
                                             ON tbl_Area.tbl_city_id = tbl_City.id
                                    WHERE  tbl_Survey_Details.tbl_parameter_id = @ParameterID
                                           AND tbl_Survey.tbl_mobile_operator_id = @OperatorID
                                           AND tbl_Area.tbl_city_id = @CityID) AS a
                                   INNER JOIN @temp temp
                                     ON a.value BETWEEN temp.range AND temp.range2
                            FOR XML Path ('')), 1, 1, '') + ']' AS data
    END

Execution plan screenshot

Revised Query:

    CREATE PROC [dbo].[Test] (@UserTypeID  INT,
                              @UserID      INT,
                              @CityID      INT,
                              @OperatorID  INT,
                              @ParameterID INT)
    AS
    BEGIN
        DECLARE @temp TABLE (
            range  DECIMAL(18, 2),
            range2 DECIMAL(18, 2),
            image  VARCHAR(50),
            symbol VARCHAR(20))

        IF( @UserID > 0 )
          BEGIN
              --print 'hii'
              INSERT INTO @temp
                          (range, range2, image, symbol)
              SELECT tbl_Legend_ViewNetwork_Dtls.range,
                     tbl_Legend_ViewNetwork_Dtls.range2,
                     tbl_Legend_ViewNetwork_Dtls.image,
                     tbl_Legend_ViewNetwork_Dtls.symbol
              FROM   tbl_Legend_ViewNetwork_Dtls
                     INNER JOIN tbl_Legend_ViewNetwork
                       ON tbl_Legend_ViewNetwork_Dtls.tbl_legend_view_network_id = tbl_Legend_ViewNetwork.id
              WHERE  tbl_Legend_ViewNetwork.parameter_id = @ParameterID
                     AND tbl_Legend_ViewNetwork.user_type_id = @UserTypeID
                     AND tbl_Legend_ViewNetwork.is_default = 1
                     AND tbl_Legend_ViewNetwork.user_id = @UserID

              UPDATE @temp
              SET    range = range2,
                     range2 = range
              WHERE  symbol = '<'
          END
        ELSE
          BEGIN
              INSERT INTO @temp
                          (range, range2, image, symbol)
              SELECT tbl_Legend_ViewNetwork_Dtls.range,
                     tbl_Legend_ViewNetwork_Dtls.range2,
                     tbl_Legend_ViewNetwork_Dtls.image,
                     tbl_Legend_ViewNetwork_Dtls.symbol
              FROM   tbl_Legend_ViewNetwork_Dtls
                     INNER JOIN tbl_Legend_ViewNetwork
                       ON tbl_Legend_ViewNetwork_Dtls.tbl_legend_view_network_id = tbl_Legend_ViewNetwork.id
              WHERE  tbl_Legend_ViewNetwork.parameter_id = @ParameterID
                     AND tbl_Legend_ViewNetwork.user_type_id = @UserTypeID
                     AND tbl_Legend_ViewNetwork.is_default = 1

              UPDATE @temp
              SET    range = range2,
                     range2 = range
              WHERE  symbol = '<'
          END

        SELECT a.lat, a.long, a.value, temp.image
        FROM   (SELECT tbl_Survey_Details.lat,
                       tbl_Survey_Details.long,
                       tbl_Survey_Details.value
                FROM   tbl_Survey_Details
                       INNER JOIN tbl_Survey
                         ON tbl_Survey_Details.tbl_survey_id = tbl_Survey.id
                       INNER JOIN tbl_Location
                         ON tbl_Survey.tbl_location_id = tbl_Location.id
                       INNER JOIN tbl_Area
                         ON tbl_Location.tbl_area_id = tbl_Area.id
                       INNER JOIN tbl_City
                         ON tbl_Area.tbl_city_id = tbl_City.id
                WHERE  tbl_Survey_Details.tbl_parameter_id = @ParameterID
                       AND tbl_Survey.tbl_mobile_operator_id = @OperatorID
                       AND tbl_Area.tbl_city_id = @CityID) AS a
               INNER JOIN @temp temp
                 ON a.value BETWEEN temp.range AND temp.range2
    END

Revised execution plan screenshot
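One speculative idea, sketched here with a column choice that is purely an assumption based on the query's predicates and select list, is a covering nonclustered index so the plan no longer has to seek into the clustered index for every qualifying row:

    CREATE NONCLUSTERED INDEX IX_SurveyDetails_Param
    ON dbo.tbl_Survey_Details (tbl_parameter_id)
    INCLUDE (tbl_survey_id, lat, [long], value);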

How to add custom sequence number in table which is going to be populated automatically

Posted: 01 Jun 2013 04:56 PM PDT

I have a table something like this:

    CREATE TABLE [dbo].[ado_test_table](
        [id] [int] IDENTITY(1,1) NOT NULL,
        [tip] [varchar](10) NOT NULL,
        [datum] [datetime] NOT NULL,
        [doc_number] [int] NULL
    ) ON [PRIMARY]

I want the doc_number column to have sequential numbers for each tip.
To be clearer, I want the column populated automatically, via a default or a computed column, with a function like this:

    create FUNCTION [dbo].[fn_test_number]
    (
         @tip varchar(10)
        ,@datum datetime
    )
    RETURNS int
    AS
    BEGIN
        DECLARE @number int

        select @number = max(doc_number) + 1
        from   ado_test_table
        where  year(datum) = year(@datum)
          and  tip = @tip

        set @number = isnull(@number, 1);

        return @number
    END

Can I, in MS SQL 2005, have a column which will automatically get these values? Should this be done with a computed column, a default (constraint), or a trigger?
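For illustration, a minimal trigger sketch (assuming single-row inserts; concurrent or multi-row inserts would need extra serialization beyond the locking hints shown):

    CREATE TRIGGER trg_ado_doc_number
    ON dbo.ado_test_table
    AFTER INSERT
    AS
    BEGIN
        SET NOCOUNT ON;
        -- number the new row within its tip/year group
        UPDATE t
        SET    doc_number = ISNULL((SELECT MAX(x.doc_number)
                                    FROM   dbo.ado_test_table x WITH (UPDLOCK, HOLDLOCK)
                                    WHERE  x.tip = t.tip
                                      AND  year(x.datum) = year(t.datum)
                                      AND  x.id <> t.id), 0) + 1
        FROM   dbo.ado_test_table t
               JOIN inserted i ON i.id = t.id;
    END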

Exporting database Oracle 7.3 on Windows NT 3.51

Posted: 01 Jun 2013 01:51 AM PDT

How do you take dump files of Oracle 7.3 running on Windows NT 3.51?

Should you go into the Orant/bin folder and run one of the programs there, like:

exp73 SYSTEM/password@[dbinstance] FULL=Y LOG=EXPORT.LOG  

That did not work.

[screenshots: the export error, and the Oracle services that are running]

MySql is not optimizing the query properly

Posted: 01 Jun 2013 06:41 AM PDT

I have a table structure as follows:

    CREATE TABLE `sale_product_inventories` (
      `id` int(11) NOT NULL AUTO_INCREMENT,
      `sale_id` int(11) NOT NULL,
      `product_id` int(11) NOT NULL,
      `size` varchar(50) COLLATE utf8_unicode_ci DEFAULT NULL,
      `tier_number` int(11) NOT NULL DEFAULT '1',
      `sale_product_pool_id` int(11) DEFAULT NULL,
      `inventory` int(11) NOT NULL,
      `in_cart_units` int(11) DEFAULT '0',
      `size_display_order` tinyint(4) NOT NULL DEFAULT '0',
      `last_updated_by` int(11) DEFAULT '0',
      `created_by` int(11) DEFAULT '0',
      `status` enum('active','inactive') COLLATE utf8_unicode_ci NOT NULL DEFAULT 'active',
      `created_at` datetime DEFAULT NULL,
      `updated_at` datetime DEFAULT NULL,
      PRIMARY KEY (`id`),
      UNIQUE KEY `UNIQUE` (`sale_id`,`product_id`,`tier_number`,`size`,`sale_product_pool_id`)
    ) ENGINE=InnoDB AUTO_INCREMENT=92872 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

NOTE: I have a UNIQUE index on (sale_id, product_id, tier_number, size, sale_product_pool_id).

When I run this query:

    select *
    from sale_product_inventories
    where sale_id in (502, 504)
      and (sale_id, product_id) in ((502,2),(502,1),(502,3),(502,4),(504,2),(504,3))

Query plan for the query above: MySQL uses the UNIQUE index and the execution time is 0.7 ms.

BUT

when I run this query

    select *
    from sale_product_inventories
    where (sale_id, product_id) in ((502,2),(502,1),(502,3),(502,4),(504,2),(504,3))

Query Plan for the second query

MySQL does not use the UNIQUE index, and the execution time is 76 ms.

MySQL: 5.5.27, InnoDB version: 1.1.8

My question is: why is MySQL behaving this way? Can someone please help me with this?

EDIT:
I came across this, so I thought it might be useful to add: MySQL generally can't use indexes on columns unless the columns are isolated in the query. "Isolating" the column means it should not be part of an expression or be inside a function in the query.
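A sketch of an equivalent rewrite that avoids the row constructor entirely, since MySQL versions before 5.7 cannot use an index range scan for (col1, col2) IN (...) predicates:

    select *
    from sale_product_inventories
    where (sale_id = 502 and product_id in (1, 2, 3, 4))
       or (sale_id = 504 and product_id in (2, 3));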

Best practice: Unions or a derived table?

Posted: 01 Jun 2013 04:16 PM PDT

I've inherited a medium-sized database with a terrible schema. The sanitized portion in question is like so:

    CREATE TABLE `pending` (
      ...
      `invoice` int(11) DEFAULT NULL,
      `lid` int(11) DEFAULT NULL,
      `custacct` varchar(21) DEFAULT NULL,
      UNIQUE KEY `pend_inv` (`invoice`),
      KEY `pend_acct` (`custacct`),
      KEY `pend_pid` (`pid`)
      ...
    ) ENGINE=InnoDB

    CREATE TABLE `done` (
      ...
      `invoice` int(11) DEFAULT NULL,
      `lid` int(11) DEFAULT NULL,
      `custacct` varchar(21) DEFAULT NULL,
      UNIQUE KEY `done_inv` (`invoice`),
      KEY `done_acct` (`custacct`),
      KEY `done_pid` (`pid`)
      ...
    ) ENGINE=InnoDB

And two similar tables, customers and locations:

    CREATE TABLE `customers` (
      `acct` varchar(14) NOT NULL,
      ...
      PRIMARY KEY (`acct`),
      ...
    ) ENGINE=InnoDB

So, yes, when an invoice is not yet fulfilled, it's in the "pending" table, then when the company's finished with it, it gets moved to the "done" table. So any given invoice will only be on one of the two tables.

Unfortunately, we're still actively using the proprietary software that runs on this schema, so I am powerless to fix it. The "pending" table will usually have around 9000 rows, whereas the "done" table is ~800,000 and counting.

For two years (the full extent of my LAMP/MySQL experience), I have been writing UNIONs to capture information from the customers or locations tables, with the first query joining customers/location/both against "pending" and the second joining customers/locations/both against "done."

I've recently had a nagging feeling that writing two nearly-identical queries UNIONed together is probably newbish and should be done differently. So I finally applied my more recent learning to a problem I had "solved" in my first few weeks of working with MySQL and realized that I can just use a derived table with a UNION inside that subquery to achieve the same results with less typing.

The execution time is nearly the same, usually 0.01s for the derived query and slightly less for the UNIONed query, but the derived table query uses far fewer characters and requires less time to type out.

So my question for those with more experience: is the derived table subquery more sane and maintainable in the long run? It's easier for me, the programmer, so that makes me suspicious that maybe it's not the best practice!

Sample queries below:

    select c.whatever, l.whatever
    from customers c
    join (
        select d.custacct as acct, d.lid from done d where d.invoice = 123456
        union
        select p.custacct, p.lid from pending p where p.invoice = 123456
    ) as combi on c.acct = combi.acct
    join locations l on combi.lid = l.lid;

    select c.whatever, l.whatever
    from done d
    join customers c on d.custacct = c.acct
    join locations l on d.lid = l.lid
    where d.invoice = 123456
    union
    select c2.whatever, l2.whatever
    from pending p
    join customers c2 on p.custacct = c2.acct
    join locations l2 on p.lid = l2.lid
    where p.invoice = 123456;

Migration of SQL Server to Oracle Database

Posted: 01 Jun 2013 05:52 AM PDT

In SQL Server, my procedure creates temporary tables whose column names come dynamically from each user, inserts data into them, and then uses the data as required. Finally, we drop the temporary tables. There is no problem when we create temporary tables with the same name but different columns, because in SQL Server temporary tables are session based.

Now I want to migrate the SQL logic to Oracle. Is there any solution to achieve this requirement in Oracle Database Server?

Answers are greatly appreciated.
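The closest Oracle analogue I know of is the global temporary table: its structure is permanent and shared, but its data is private per session (or per transaction). A minimal sketch with hypothetical columns:

    CREATE GLOBAL TEMPORARY TABLE my_temp_data (
        col1 VARCHAR2(100),
        col2 NUMBER
    ) ON COMMIT PRESERVE ROWS;  -- rows live until the session ends

Because the structure is fixed and shared, per-user dynamic columns would still require dynamic SQL (EXECUTE IMMEDIATE) or a generic key/value layout on top of the temporary table.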

Unable to drop the table since its showing waiting for meta data lock

Posted: 01 Jun 2013 12:16 AM PDT

We are trying to drop one table, but it hangs, and SHOW PROCESSLIST shows the statement as 'waiting for metadata lock'. We are also unable to perform any other operation on that particular table. Does anybody know how to resolve this?
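A sketch of how one might locate the blocker; on MySQL 5.5 a metadata lock is typically held by a transaction that is still open, even an idle one (the id passed to KILL is hypothetical):

    SHOW PROCESSLIST;                              -- look for long-running or sleeping sessions
    SELECT * FROM information_schema.INNODB_TRX;   -- transactions that are still open
    KILL 12345;                                    -- hypothetical id of the session holding the lock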

Fulltext stoplist replication

Posted: 01 Jun 2013 07:16 PM PDT

In MS SQL Server 2008 R2 I am replicating a table with a full-text index on it, but the stoplist associated with the replicated full-text index doesn't replicate.

Is there any way to replicate the stoplist as well?

SQL server ODBC connection for Active Directory user who doesn't log in to the windows

Posted: 01 Jun 2013 02:16 PM PDT

Do you think it is possible to create a SQL Server ODBC connection for an Active Directory user who doesn't log in to Windows?

Ideally this type of user would be used in a batch process: one person logs in, creates the batch process, and runs it as another user.

Note: I don't want to create a SQL Server authentication login. Instead I would like to use Active Directory.

Thanks.

Process attempted to unlock a resource it does not own

Posted: 01 Jun 2013 01:16 PM PDT

SQL Server 2005 SP4 32-Bit

I have a DBCC CHECKDB job running nightly. Last night, soon after the job started, I got the errors below. The database is NOT in suspect mode, and CHECKDB comes back clean when I run it now. The database is fully accessible. Should I be concerned? I'd hate to go back to a backup at this point.

    2013-04-02 02:10:55.53 spid56   Error: 1203, Severity: 20, State: 1.
    2013-04-02 02:10:55.53 spid56   Process ID 56 attempted to unlock a resource it
                                    does not own: PAGE: 34:1:388664. Retry the
                                    transaction, because this error may be caused
                                    by a timing condition. If the problem persists,
                                    contact the database administrator.
    2013-04-02 02:10:55.58 spid56   Error: 3314, Severity: 17, State: 3.
    2013-04-02 02:10:55.58 spid56   During undoing of a logged operation in
                                    database 'MY_DATABASE_NAME', an error occurred
                                    at log record ID (1342973:12519:37). Typically,
                                    the specific failure is logged previously as
                                    an error in the Windows Event Log service.
                                    Restore the database or file from a backup,
                                    or repair the database.

Setting DATEFIRST to 6

Posted: 01 Jun 2013 05:16 PM PDT

I would like to set DATEFIRST to 6 (i.e. Saturday). I have read that to change it permanently I could set the default language of the login to one of the existing languages from sys.syslanguages, and alter the user's login like:

    USE [master]
    GO
    ALTER LOGIN [some_user] WITH DEFAULT_LANGUAGE = [Français]
    GO

But:

SELECT * FROM sys.syslanguages;  

I get many languages, but none of them has DATEFIRST set to 6 (they are all set to 1 or 7).

Question:

Can I add another language based on an existing one, and set its datefirst field to 6?

Can I modify an existing syslanguage?

All I want is to set it to 6 permanently, because I need it to get the right week number.

Thank you for your help!
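For reference, a per-session workaround sketch; as far as I know there is no supported way to add or modify a sys.syslanguages entry, so issuing SET DATEFIRST at the start of each session (or inside the procedures that need it) is the usual approach:

    SET DATEFIRST 6;      -- week starts on Saturday for this session
    SELECT @@DATEFIRST;   -- returns 6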

mysql second slave not syncing while first slave works fine

Posted: 01 Jun 2013 09:11 PM PDT

I have a master (m) - slave (s1) setup using mysql 5.1.45

When I try to add a second slave (s2), the slave lags behind and never catches up on the sync.

Even after syncing s2 with the whole system offline (Seconds_Behind_Master = 0), after a few hours s2 gets out of sync again.

Strangely, s1 is always in sync.

Any ideas?

    SHOW SLAVE STATUS \G  (on slave2)
    *************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: xxx.xxx.xxx.xxx
                  Master_User: xxxx_xxxx5
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.013165
          Read_Master_Log_Pos: 208002803
               Relay_Log_File: xxxxxxxxxx-relay-bin.000100
                Relay_Log_Pos: 1052731555
        Relay_Master_Log_File: mysql-bin.013124
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: xxxxxxxxx
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
    Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 1052731410
              Relay_Log_Space: 44233859505
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 69594
    Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:

iperf results between servers:

    M -> s2
    [ ID] Interval       Transfer     Bandwidth
    [  5]  0.0-10.0 sec    502 MBytes    420 Mbits/sec
    [ ID] Interval       Transfer     Bandwidth
    [  4]  0.0-10.0 sec  1.05 GBytes    902 Mbits/sec

    M -> s1
    [ ID] Interval       Transfer     Bandwidth
    [  4]  0.0-10.0 sec    637 MBytes    534 Mbits/sec
    [ ID] Interval       Transfer     Bandwidth
    [  5]  0.0-10.0 sec    925 MBytes    775 Mbits/sec

vmstat for s2

    vmstat
    procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
     r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
     1  0    268 126568 199100 22692944    0    0   100   836    8   81  1  0 96  3

    vmstat 2 10
    procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
     r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
     0  0    268 1150144 197128 21670808    0    0   100   835    9   81  1  0 96  3  0
     0  0    268 1144464 197160 21674940    0    0   644  3096 1328 1602  0  0 97  2  0
     0  2    268 1140680 197176 21679624    0    0   846  5362 1002 1567  0  0 98  2  0
     0  1    268 1135332 197192 21685040    0    0   960  3348  850 1193  0  0 98  1  0
     0  0    268 1130776 197204 21688752    0    0   576  2894  978 1232  0  0 98  2  0
     0  0    268 1127060 197264 21693556    0    0   586  5202 1075 1505  0  0 97  3  0
     0  0    268 1122184 197272 21698412    0    0   896  1160  614  727  0  0 98  1  0
     0  0    268 1118532 197300 21702780    0    0   586  5070 1279 1708  0  0 93  6  0
     0  0    268 1114000 197324 21705820    0    0   402  1522  947  942  0  0 95  4  0
     0  0    268 1109708 197336 21710188    0    0   704  9150 1224 2109  0  0 97  2  0

top output on s2

    top - 14:44:25 up 16:36,  1 user,  load average: 1.62, 1.47, 1.42
    Tasks: 140 total,   1 running, 139 sleeping,   0 stopped,   0 zombie
    Cpu0  :  2.9%us,  1.1%sy,  0.0%ni, 73.8%id, 21.8%wa,  0.0%hi,  0.4%si,  0.0%st
    Cpu1  :  0.8%us,  0.3%sy,  0.0%ni, 95.5%id,  3.3%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu2  :  0.6%us,  0.3%sy,  0.0%ni, 97.7%id,  1.4%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu3  :  0.5%us,  0.2%sy,  0.0%ni, 98.9%id,  0.4%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu4  :  0.0%us,  0.0%sy,  0.0%ni, 99.9%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu5  :  0.0%us,  0.0%sy,  0.0%ni, 99.9%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu6  :  0.0%us,  0.0%sy,  0.0%ni, 99.9%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu7  :  0.0%us,  0.0%sy,  0.0%ni, 99.9%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Mem:  24744184k total, 24005508k used,   738676k free,   199136k buffers
    Swap:  1050616k total,      268k used,  1050348k free, 22078920k cached

Any ideas?

Is there any chance that the MySQL version is the culprit of all this, in conjunction with the nearly 5-fold increase in traffic to the master?

If that is the case, then why does s1 sync and s2 not?

Any idea whether 5.6.x solves similar problems?

How does MySQL determine the 'index first key' and 'index last key' with indexes?

Posted: 01 Jun 2013 09:16 PM PDT

In other words, how does MySQL find the result range with indexes on the first step when it's executing a select query?

compare the same table

Posted: 01 Jun 2013 11:16 AM PDT

I am facing an issue with the following query. When I execute it, it takes a very long time. I broke the query into two parts and compared them with a shell script, but is there any way to do it with one query?

Any suggestion welcome.

    select distinct substring(mobile_num, 3, 12)
    from mobile
    where status = 'INACTIVE'
      and date(unsub_date) >= DATE(CURDATE() - INTERVAL 90 DAY)
      and mobile_num not in (select distinct mobile_num from mobile where status = 'ACTIVE')
    order by updtm;
    mobile_num   varchar(12)
    keyword      varchar(45)
    sub_date     datetime
    unsub_date   datetime
    circle_name  varchar(45)
    type         varchar(45)
    status       varchar(45)
    operator     varchar(45)
    act_mode     varchar(45)
    deact_mode   varchar(45)
    id           bigint(20)
    updtm        timestamp
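For comparison, a sketch of the same logic as a single query, using an anti-join instead of NOT IN (often much faster on older MySQL versions, which do not optimize NOT IN subqueries well):

    select distinct substring(m.mobile_num, 3, 12)
    from mobile m
    left join mobile a
           on a.mobile_num = m.mobile_num
          and a.status = 'ACTIVE'
    where m.status = 'INACTIVE'
      and date(m.unsub_date) >= DATE(CURDATE() - INTERVAL 90 DAY)
      and a.mobile_num is null
    order by m.updtm;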

Show processlist / Max concurrent connections seem to max out at 131

Posted: 31 May 2013 11:16 PM PDT

When I put my database under a load test (basically just a basic write and a basic read from our app) and run show processlist; in the middle of it, there are always 131 rows in the set, and that's it. I'm running a master/slave with 16GB of memory on each, on Joyent Percona instances based on Solaris 10. They are configured to each use up to 8GB of memory and 5000 max concurrent connections. What could be limiting it to 131?

Deadlock free MERGE-alike in MySQL?

Posted: 01 Jun 2013 01:16 AM PDT

I am trying to implement some parts of MERGE in the MySQL driver in Drupal. Of course, Drupal has something but in truth it only works because the most frequent MERGE issuer just eats exceptions.

So, whatever we try, deadlocks occur. What we do: we start a transaction, then SELECT ... FOR UPDATE, try an INSERT, and if it causes a 23xxx integrity error, try an UPDATE instead. Deadlocks. We removed the FOR UPDATE because we decided that for our use it's OK. Still deadlocks.

I can't just switch isolation levels because READ COMMITTED needs row logging per http://dev.mysql.com/doc/refman/5.1/en/set-transaction.html

As of MySQL 5.1, if you use READ COMMITTED [...] you must use row-based binary logging.

And per http://stackoverflow.com/a/2032096/308851 READ UNCOMMITTED also needs row logging. And here comes http://dev.mysql.com/doc/refman/5.1/en/binary-log-setting.html

To change the global binlog_format value, you must have the SUPER privilege. This is also true for the session value as of MySQL 5.1.29.

I can't require every Drupal setup to have SUPER, nor can we say that Drupal is incompatible with statement-based binlogs when that's the default and the most widespread.

INSERT ... ON DUPLICATE KEY is neither versatile enough nor deadlock free: http://bugs.mysql.com/bug.php?id=52020

So what now?

sql server database sharding - what to do with common data / non sharded data

Posted: 01 Jun 2013 06:16 PM PDT

We have a very large scale enterprise level database. As part of our business model, all web users hit our web servers at the same time each month, which in turn hammers our SQL box. The traffic is very heavy and continues to grow as the company grows. SQL proc optimization has been performed, and hardware has already been scaled up to a very high level.

We are looking to shard the database now to ensure that we can handle company growth and future loads.

We have decided what particular data should be sharded. It is a subset of our database which is highly utilized.

However, my question is regarding the non-sharded data which is common/universal. An example of data like this may be an Inventory table, or possibly an Employee table, a User table, etc.

I see two options to handle this common/universal data:

1) design 1 - Place the common/universal data in an external database. All writes will occur here. This data will then be replicated down to each shard allowing each shard to read this data and inner join to this data in t-sql procs.

2) design 2 - Give each shard its own copy of all common/universal data. Let each shard write locally to these tables and utilize sql merge replication to update/sync this data on all other shards.

Concerns about design #1

1) Transactional issues: if you have a situation in which you must write or update data in a shard and then write/update a common/universal table in one stored proc, for instance, you will no longer be able to do this easily. The data now exists on separate SQL instances and databases. You may need to involve MS DTC to see if you can wrap these writes into a transaction, since they are in separate databases. Performance is a concern here, and possible rewrites may be involved for procs that write to sharded and common data.

2) A loss of referential integrity: it is not possible to enforce cross-database referential integrity.

3) Recoding large areas of the system so that it knows to write common data to the new universal database but read common data from the shards.

4) Increased database trips. As in #1 above, when you run into a situation in which you must update sharded data and common data, you are going to make multiple round trips to accomplish this, since the data is now in separate databases. There is some network latency here, but I am not as worried about this issue as about the above three.

Concerns about design #2

In design #2 each shard gets its own instance of all common/universal data. This means that all code that joins to or updates common data continues to work/run just like it does today. There is very little recoding/rewriting needed from the development team. However, this design completely depends on merge replication to keep the data in sync across all shards. The DBAs are highly skilled and are very concerned that merge replication may not be able to handle this, and that should merge replication fail, recovery from the failure is not great and could impact us very negatively.

I am curious to know if anyone has gone with design option #2. I am also curious to know if I am overlooking a 3rd or 4th design option that I do not see.

Thank you in advance.

set the clock on mysql [closed]

Posted: 01 Jun 2013 09:16 AM PDT

I would like to know where MySQL gets its current time. I am updating a table using

update `table1` set `cola` = 1, `colb` =2 , `updated` = NOW();  

The time is 2 hours early. The time on the server is completely different, so I'm a little confused...

I tried

 SELECT @@global.time_zone, @@session.time_zone;  

but I get SYSTEM as the result.
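A sketch of pinning the zone explicitly (the offset is a hypothetical value for your region); NOW() follows the session time zone, which defaults to the server's system zone:

    SET GLOBAL time_zone  = '+02:00';   -- hypothetical offset; requires SUPER
    SET SESSION time_zone = '+02:00';
    SELECT @@global.time_zone, @@session.time_zone, NOW();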

Does pt-table-checksum point out rows that are different between Master and Slave?

Posted: 01 Jun 2013 08:57 PM PDT

I am trying to verify the replication integrity of my replica, and I found that pt-table-checksum is one of the best ways of doing it.

I have a simple Master->Slave setup. I created a sample table with 10 rows on the master, which replicated successfully to the slave. On the slave, I intentionally changed a column in one of the rows of that table. Then I executed:

pt-table-checksum "h=host,u=user,p=password,P=3306" --function=MD5 --databases=test --tables=tentotal  

And indeed, in the checksums table that pt-table-checksum created, I could see different checksums for master and slave for that table. Excellent so far!

The output was:

    TS              ERRORS  DIFFS  ROWS  CHUNKS  SKIPPED  TIME   TABLE
    09-05T12:17:30  0       0      3     1       0        0.182  test.tentotal

Strangely, the DIFFS column says 0, which is incorrect. What am I missing here?

Percona Toolkit version: 2.1.3
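One way to see which chunks differ is to query the checksums table on the slave directly; a sketch following the pattern in the pt-table-checksum documentation (assuming the default percona.checksums table):

    SELECT db, tbl, chunk, this_cnt, master_cnt, this_crc, master_crc
    FROM   percona.checksums
    WHERE  master_cnt <> this_cnt
        OR master_crc <> this_crc
        OR ISNULL(master_crc) <> ISNULL(this_crc);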

Database Design - Creating Multiple databases to avoid the headache of limit on table size

Posted: 01 Jun 2013 05:45 PM PDT

I need to design a database to store information contained in millions of log files generated by devices on trial on a network.

My first approach was to store all the information in tables in a single database, but this approach seems to fail because the data to be stored is too large. So I thought of creating a separate database for each device and storing each device's log files there. My question is whether this approach can spare me the headache involved with the table size limit. Below are my platform specifications:

  • Operating System - Windows 7
  • File System - NTFS
  • Storage Engine - InnoDB
  • Table Type - InnoDB

I want to build my database design in a way that I will not need to apply table-shrinking procedures in the future.

Is it OK to blindly add missing indexes?

Posted: 01 Jun 2013 07:52 AM PDT

I often use SSMS to test my slow stored procedures for missing indexes. Whenever I see a "Missing Index (Impact xxx)" my kneejerk reaction is to just create the new index. This results in a faster query every time as far as I can tell.

Any reason why I shouldn't continue to do this?
