Thursday, July 25, 2013

Advice on scripting a "whitewashed" and downsized development db from main db

Posted: 25 Jul 2013 04:26 PM PDT

I hope this is the right place to ask this:

Currently we (an active open-source project) have a large production database (on PostgreSQL) which contains all the site records. Since we can't just give every developer or contributor access to the db (it contains emails, passwords, phone numbers, etc.), but we still need to give the developers a somewhat up-to-date db (using SQLite), we currently do the following process:

  1. dump the db
  2. analyze and change the db with custom Python scripts, including truncating tables to make exporting to SQLite more bearable and removing sensitive data like passwords
  3. export this to the SQLite dev db.

This is very slow and error prone.

The question: is there a recommended/best-practice way to do this? Both the whitewashing and the table truncating (without breaking the object-relational mapping between tables).

As I analyze the problem domain, I see the main problem is the object-relational mapping. I can't just pull the first thousand records from every table (with LIMIT), since a record in row 900 of one table, for example, may have a foreign key to row 1001 in another table. I guess changing and sanitizing data can be done with views, replacing certain columns with calculated ones (based on the original table columns); then the cron job could just dump the view tables.
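For what it's worth, a minimal sketch of such a sanitizing view in PostgreSQL (the table and column names here are hypothetical):

-- Hypothetical example: expose users with sensitive columns masked.
CREATE VIEW sanitized_users AS
SELECT id,
       name,
       md5(email) || '@example.com' AS email,     -- stable fake address
       'x'                          AS password,   -- never export real hashes
       NULL::text                   AS phone
FROM   users;

A cron job could then dump sanitized_users instead of users.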

I'll be glad for help/reference

Thanks!

Need help designing table with list of IDs to store

Posted: 25 Jul 2013 03:42 PM PDT

I have a table that I need to create to hold saved emails for FUTURE delivery (to, from, message, scheduled send date, etc.). The catch here is that I don't know who's supposed to receive the email until the day of delivery. What I mean is, the email is created to go to certain selected organizations, but the "members" of the organization will be constantly joining and leaving, so if someone creates an email for delivery next month, the list of member email addresses to send to will be different by then. So what I need to save in the table is the list of organizations that the email should go to, so that I can query for the latest list of member email addresses when I actually send the email. Hope that makes sense.

Anyway, so my question is: what is considered a "proper design" for this? My initial thought is to just save a comma-delimited list of organization ids. I know I will never have to search on which organizations were on the list, so I don't care if it's not queryable. And I know I could normalize it into one row per recipient organization, but it seems such an unnecessary repeat of data for no purpose, especially since I only query on the SENDER, not the recipients.

So is a list of Ids just a horrible, no good, only-a-newbie-would-think-of-that, bad thing? Or can it be used in some cases? Or is there some other way to do this that I don't know about? I'm sure I can't be the only one who's run into a situation like this before!
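For reference, the normalized alternative would look something like the sketch below (the table and column names are my own assumptions, not from an existing schema):

-- Hypothetical sketch of the normalized design.
CREATE TABLE scheduled_email (
    email_id       INT PRIMARY KEY,
    sender         VARCHAR(255) NOT NULL,
    subject        VARCHAR(255) NOT NULL,
    body           TEXT         NOT NULL,
    scheduled_date DATETIME     NOT NULL
);

-- One row per recipient organization; member addresses are
-- resolved from this list only at send time.
CREATE TABLE scheduled_email_org (
    email_id INT NOT NULL REFERENCES scheduled_email (email_id),
    org_id   INT NOT NULL,
    PRIMARY KEY (email_id, org_id)
);

The comma-delimited column works until someone eventually does need to query by recipient; the junction table costs little and keeps that door open.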

Thanks in advance for your help!

InnoDB: Error: pthread_create returned 12

Posted: 25 Jul 2013 04:00 PM PDT

I have installed MySQL in a particular folder. I was able to run the server and create accounts, databases, etc. However, now whenever I try to start the server, I get an error:

$ mysqld_safe --defaults-file=mysql.cnf &
[1] 2002
[compute-0-5 /amber2/scratch/myname/mysql]$ 130725 17:56:24 mysqld_safe Logging to '/amber2/scratch/myname/mysql/data/compute-0-5.local.err'.
130725 17:56:24 mysqld_safe Starting mysqld daemon with databases from /amber2/scratch/myname/mysql/data
130725 17:56:25 mysqld_safe mysqld from pid file /amber2/scratch/myname/mysql/data/compute-0-5.local.pid ended
[1]+  Done                    mysqld_safe --defaults-file=mysql.cnf

In the error file inside the data folder:

130725 17:17:53 mysqld_safe Starting mysqld daemon with databases from /amber2/scratch/myname/mysql/data
2013-07-25 17:17:54 0 [Warning] option 'read_buffer_size': unsigned value 2147483648 adjusted to 2147479552
2013-07-25 17:17:54 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
2013-07-25 17:17:54 28189 [Warning] Buffered warning: Changed limits: max_open_files: 1024 (requested 5000)
2013-07-25 17:17:54 28189 [Warning] Buffered warning: Changed limits: table_cache: 431 (requested 2000)
2013-07-25 17:17:54 28189 [Note] Plugin 'FEDERATED' is disabled.
2013-07-25 17:17:54 28189 [Note] InnoDB: The InnoDB memory heap is disabled
2013-07-25 17:17:54 28189 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2013-07-25 17:17:54 28189 [Note] InnoDB: Compressed tables use zlib 1.2.3
2013-07-25 17:17:54 28189 [Note] InnoDB: Using Linux native AIO
2013-07-25 17:17:54 28189 [Note] InnoDB: Not using CPU crc32 instructions
2013-07-25 17:17:54 28189 [Note] InnoDB: Initializing buffer pool, size = 128.0M
2013-07-25 17:17:54 28189 [Note] InnoDB: Completed initialization of buffer pool
2013-07-25 17:17:54 28189 [Note] InnoDB: Highest supported file format is Barracuda.
InnoDB: Error: pthread_create returned 12
130725 17:17:55 mysqld_safe mysqld from pid file /amber2/scratch/myname/mysql/data/compute-0-18.local.pid ended

But why am I getting the pthread_create error 12? It seems this is related to not enough space. On the device where the mysql folder (/amber2/scratch/myname/mysql) resides, I do have space:

$ df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1              49G  5.0G   41G  11% /
...
10.255.255.46:/export/scratch
                       15T   11T  4.2T  72% /amber2/scratch

I also have a few gigs available in my home directory quota:

$ quota -v
Disk quotas for user myname (uid 41222):
     Filesystem  blocks   quota   limit   grace   files   quota   limit   grace
10.255.255.45:/export/ncms
                22986221  26214400 26214400               0       0       0
10.255.255.46:/export/scratch
                 7321108       0       0               0       0       0

i.e., I am using 22G out of 25G:

$ du -sh ~/.
22G /home/ncms/myname/.

I also have free memory:

$ free -mg
             total       used       free     shared    buffers     cached
Mem:            62         41         21          0          0         28
-/+ buffers/cache:         12         49
Swap:           64         13         51

In my mysql configuration file:

myisam_sort_buffer_size=4G
myisam_max_sort_file_size=200G
read_buffer_size=2G

So why am I getting the pthread_create error while starting the server?
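A note that may help frame this: error code 12 is ENOMEM, so the failure is about memory, not disk. Per-thread buffers such as read_buffer_size are allocated for each connection/thread, so a 2G setting is a likely suspect. A hedged sketch of more conservative values (illustrative only, not tuned for this server):

# Illustrative my.cnf values only; tune to the actual workload.
read_buffer_size          = 2M    # allocated per thread, keep it small
myisam_sort_buffer_size   = 256M  # used per repair/index build
myisam_max_sort_file_size = 200G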

How to disable flashback query logging for a specific table (Oracle)?

Posted: 25 Jul 2013 03:01 PM PDT

We have a specific table that has a lot of activity and it creates a lot of change records. The consequence is that the flashback data only goes back a couple of days. That is OK for many cases but it would be beneficial to have access to more historical data.

We would like to either restrict logging on that table or disable it completely. I imagine that we may be able to do this by tablespace; I just have not found much on how to make these changes.

mongodb user for ubuntu EC2 instance

Posted: 25 Jul 2013 02:53 PM PDT

I am trying to install mongodb on an Ubuntu EC2 instance. However, I am confused about which user the DB should run as:

If I follow: http://docs.mongodb.org/manual/tutorial/install-mongodb-on-ubuntu/

Then it says: "mongodb"

If I follow : http://docs.mongodb.org/ecosystem/tutorial/install-mongodb-on-amazon-ec2/

It says: "mongod"

I think this left me in an inconsistent state. There was a mongodb process showing in the ps output, but

sudo service mongodb status (or stop) says: Unknown Instance.

What should the user of mongodb be?

Identifying Unused Stored Procedures

Posted: 25 Jul 2013 04:10 PM PDT

This next year, I am helping an effort to clean several SQL Server environments. We have about 10,000 stored procedures and estimate that only about 1,000 of them are used on a regular basis, and another 200 or so are used on rare occasions, meaning we have a lot of work to do. Since we have multiple departments and teams that can access these databases and procedures, we are not always the ones calling the procedures, meaning that we must determine what procedures are being called. On top of that, we want to determine this over a few months, not in a few days (which eliminates some possibilities).

One approach to this is to use SQL Server Profiler to track what procedures are being called and compare them to the list of the procedures we have, marking whether each procedure is used or not. From there, we could move the procedures to a different schema in case a department comes screaming.

Is using the Profiler the most effective approach here? And/Or have any of you done something similar and found another way/better way to do this?
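One alternative worth weighing (a sketch, not necessarily a complete solution): SQL Server 2008+ exposes per-procedure execution counts from the plan cache, though entries disappear when plans are evicted, so this would need to be snapshotted on a schedule and accumulated over the months-long window:

-- Sketch: poll procedure usage from the plan cache (SQL Server 2008+).
-- Only procedures whose plans are currently cached show up here.
SELECT d.name AS database_name,
       OBJECT_NAME(ps.object_id, ps.database_id) AS procedure_name,
       ps.execution_count,
       ps.last_execution_time
FROM   sys.dm_exec_procedure_stats AS ps
       JOIN sys.databases AS d
         ON d.database_id = ps.database_id
ORDER  BY ps.execution_count DESC;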

Replicate to a secondary database that also allows edits

Posted: 25 Jul 2013 07:12 PM PDT

I have a client with a somewhat odd requirement. They want to be able to take a snapshot of their sales database that their accountants can then use once it's disconnected from the live database. That makes sense. But the accountants also want to be able to make edits to historical data in the secondary database, and have those edits retained the next time they take a snapshot.

I'm at a loss for how to do this. I could enable change tracking, then go through the change-tracking tables and reapply their changes after recreating the secondary database, but that sounds like it would quickly get messy.

Could I possibly use log shipping for this? They tell me that the data they would be editing in the secondary, historical database is unlikely to be touched in the primary database. But if there have been changes to the secondary database, will I still be able to restore transaction logs?

I'm really pretty clueless on how to proceed... Any advice would be appreciated!

How to revoke DBADM and clean up any related objects from DB2 LUW

Posted: 25 Jul 2013 04:16 PM PDT

This question arose because we take copies of production backups and restore them into lower environments (with scrambled data, of course) for developers to practice and/or debug against.

We have IDs that are assigned DBADM, SECADM, DATAACCESS, and ACCESSCTRL (mainly the instance owner). When we restore into a lower environment, we end up needing to log on as the original instance owner and grant the same authorities (DBADM, SECADM, DATAACCESS, and ACCESSCTRL) to the new target instance owner.

I figured it was not a good idea to leave the original ID in the database, so I attempted to revoke its privileges. Not long after, my package rebinds started failing, as there are apparently packages tied to that original ID. Not knowing what they contained and whether I could/should delete them or not (even though I'm thinking they are harmless to remove???), I ended up restoring the privileges to the original instance owner and just leaving it there.

That has always bothered me. An ID with the powers of DBADM, SECADM, DATAACCESS, and ACCESSCTRL shouldn't just be lying around. To me that is a security hole. The ID should be revoked, and any other cleanup that needs to be performed should be executed to keep the database safe. The same would be true if, say, I or my fellow DBAs were to quit the company. We would want to revoke IDs and clean up and/or transfer ownership of objects to remove any security holes.

The problem is, I don't know everything that needs to be cleaned up. I have done a bit of poking around in Google and IBM's documentation, and I can find nothing describing the steps that should normally be taken when removing such a highly privileged user from the system. Or even a user with BINDADD authority.

What do you all remove/revoke and in what order? Is there a way this can be scripted/automated? How do you know which packages you can remove? Are there other things that need to be moved/transferred?

Anyone else encounter this? Thoughts? Experiences?
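On the package question specifically, a hedged starting point: the catalog records who bound each package, so something like the following sketch (OLDOWNER is a placeholder for the original authorization ID) can at least identify which packages are tied to the old ID before deciding what to drop or rebind:

-- Sketch: list packages bound by the original instance owner.
SELECT pkgschema, pkgname, boundby, last_bind_time
FROM   syscat.packages
WHERE  boundby = 'OLDOWNER';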

MySQL SSL encryption

Posted: 25 Jul 2013 01:17 PM PDT

Does setting MASTER_SSL to 1 in CHANGE MASTER ensure encryption without specifying the options MASTER_SSL_CA, MASTER_SSL_CAPATH, MASTER_SSL_CERT, and MASTER_SSL_KEY?

After setting MASTER_SSL to 1, SHOW SLAVE STATUS says Master_SSL_Allowed: Yes, but does that ensure the transferred data will be encrypted?

 Master_SSL_Allowed: Yes
 Master_SSL_CA_File:
 Master_SSL_CA_Path:
    Master_SSL_Cert:
  Master_SSL_Cipher:
     Master_SSL_Key:
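For comparison, a sketch of the more explicit form, under the assumption that the goal is for the slave to verify the master's certificate rather than merely permit SSL (the file paths are placeholders):

-- Sketch only; paths are placeholders.
CHANGE MASTER TO
    MASTER_SSL = 1,
    MASTER_SSL_CA   = '/etc/mysql/ssl/ca-cert.pem',
    MASTER_SSL_CERT = '/etc/mysql/ssl/client-cert.pem',
    MASTER_SSL_KEY  = '/etc/mysql/ssl/client-key.pem',
    MASTER_SSL_VERIFY_SERVER_CERT = 1;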

Thanks!

Return multiple rows from matching on multiple CASE matches?

Posted: 25 Jul 2013 01:47 PM PDT

I want to add a column to my query which will specify one or more categories a row matches. I want to take this:

      +--------------+---------------+
  Row | Product      | Quantity_Sold |
      +--------------+---------------+
  1   | Coca-Cola    | 15            |
  2   | Cigarettes   | 4             |
  3   | Pretzel      | 6             |
  4   | Beer         | 25            |
  5   | Popcorn      | 10            |
  6   | Candy Bar    | 10            |
      +--------------+---------------+

And return this:

      +--------------+---------------+----------------------+
  Row | Product      | Quantity_Sold | Category             |
      +--------------+---------------+----------------------+
  1   | Coca-Cola    | 15            | Beverages            |
  2   | Cigarettes   | 4             | Controlled Substance |
  3   | Pretzel      | 6             | Snacks               |
  4   | Beer         | 25            | Beverages            |
  5   | Beer         | 25            | Controlled Substance |
  6   | Popcorn      | 10            | Snacks               |
  7   | Candy Bar    | 10            | Snacks               |
      +--------------+---------------+----------------------+

Notice on lines 4-5 of the output, "Beer" is on two lines, because it fits in two categories.

If I try to do this with CASE, only the first match will be counted.

This query

SELECT
    Product,
    Quantity_Sold,
    CASE
        WHEN Product IN ('Coca-Cola', 'Beer')
            THEN 'Beverages'
        WHEN Product IN ('Pretzel', 'Popcorn', 'Candy Bar')
            THEN 'Snacks'
        WHEN Product IN ('Cigarettes', 'Beer')
            THEN 'Controlled Substance'
    END AS Category
FROM sales_table;

would only return this output:

      +--------------+---------------+----------------------+
  Row | Product      | Quantity_Sold | Category             |
      +--------------+---------------+----------------------+
  1   | Coca-Cola    | 15            | Beverages            |
  2   | Cigarettes   | 4             | Controlled Substance |
  3   | Pretzel      | 6             | Snacks               |
  4   | Beer         | 25            | Beverages            |
  5   | Popcorn      | 10            | Snacks               |
  6   | Candy Bar    | 10            | Snacks               |
      +--------------+---------------+----------------------+

(Notice "Beer" only appears once)

So how can I get it to show up on separate lines for all categories it matches?
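One approach that seems to fit (a sketch, assuming a sales_table shaped like the sample above): move the category rules into a derived table and JOIN against it, so a product matching several rules produces several rows:

SELECT s.Product,
       s.Quantity_Sold,
       c.Category
FROM   sales_table AS s
       JOIN (SELECT 'Coca-Cola'  AS Product, 'Beverages' AS Category
             UNION ALL SELECT 'Beer',        'Beverages'
             UNION ALL SELECT 'Beer',        'Controlled Substance'
             UNION ALL SELECT 'Cigarettes',  'Controlled Substance'
             UNION ALL SELECT 'Pretzel',     'Snacks'
             UNION ALL SELECT 'Popcorn',     'Snacks'
             UNION ALL SELECT 'Candy Bar',   'Snacks') AS c
         ON c.Product = s.Product;

In the long run, the derived table is usually promoted to a real product-to-category mapping table.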

Replacing master.dbo.sysperfinfo with sys.dm_os_performance_counters

Posted: 25 Jul 2013 11:44 AM PDT

I'm working with some old code that pulls performance counters, and part of what I'm doing is making sure we're doing some processes efficiently and effectively.

Right now I have this code that I'm looking at:

select replace(rtrim(object_name),'SQLServer:','') as 'Object',
       rtrim(counter_name) as 'Counter',
       rtrim(instance_name) as 'Instance',
       cntr_value as Value
from   master.dbo.sysperfinfo
where  object_name <> 'SQLServer:User Settable'

I'm proposing to replace it with:

select replace(rtrim(object_name),'SQLServer:','') as 'Object',
       rtrim(counter_name) as 'Counter',
       rtrim(instance_name) as 'Instance',
       cntr_value as Value
from   sys.dm_os_performance_counters
where  object_name <> 'SQLServer:User Settable'

This is faster and more compatible with everything past SQL 2000. No environment this will be used in still runs SQL 2000; the earliest version the above code would be run on is SQL 2008. I've checked that the returned values are congruent, and it works in the context of the rest of the code.

My question is: What are the differences between dm_os_performance_counters and master.dbo.sysperfinfo? Do I need to pull dm_os_performance_counters from the Master context?

MySQL Replication not proceeding

Posted: 25 Jul 2013 11:52 AM PDT

I have a weird replication problem I have not seen before. It's basic MySQL replication with a single master and two slaves. One of the slaves is not executing replication events and seems stuck at some point yesterday. The other is current.

  • Running show slave status on the problematic slave shows no increases in any of the counters.
  • It lists both the IO and SQL threads as running.
  • Seconds behind reports 0.
  • None of the log counters are increasing.
  • No errors are reported. Running stop/start slave returns no errors. Bouncing the server reports nothing out of the ordinary in the .err log and says it's picking up replication from the relay log position it's stuck on.
  • The master shows the slave as connected
  • The slave shows two system user replication threads reporting "Waiting for master to send event" and "Slave has read all relay log; waiting for the slave I/O thread to update it". Their Time counters in the process list are just steadily increasing.
  • Attempts to connect from the slave w/ the replication credentials to the master via commandline client work fine
  • There is plenty of disk space in both the datadir and logdir
  • The Master_log_file it's reporting still exists on the master according to both show binary logs and looking at the actual filesystem (it wasn't pruned or manually deleted from the FS)
  • The master and both slaves are running the same percona build (5.5.29-30.0-log) and have been as such for many months.

I'm at a loss on what to further troubleshoot. Help?

Postgres 9.1.6 Error: index contains unexpected zero page at block 0

Posted: 25 Jul 2013 11:03 AM PDT

I have set up streaming replication on Postgres 9.1.6 running on a Debian server, and it is going fine.

When I try to run a query on the replica DB I get the error below:

ERROR:  index "tbl_cust_id_idx" contains unexpected zero page at block 0
HINT:  Please REINDEX it.

What might be causing this error?

The question is also posted in http://stackoverflow.com/questions/17865135/postgres-9-1-6-error-index-contains-unexpected-zero-page-at-block-0
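For completeness, the immediate remedy the HINT points to is below; note it would have to run on the primary, since a streaming replica is read-only, and the rebuilt index then flows to the standby:

-- Run on the primary; the rebuilt index streams to the replica.
REINDEX INDEX tbl_cust_id_idx;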

PostgreSQL CREATE TABLE creates with incorrect owner

Posted: 25 Jul 2013 11:43 AM PDT

I'm using PostgreSQL 9.2.4. When I create a table as a non-superuser in a database owned by that non-superuser, it is owned by the postgres user, so I can't put any data into it unless I explicitly grant myself permission.

I created the non-superuser like this:

admin_user=# create role "test1" NOINHERIT LOGIN ENCRYPTED PASSWORD 'wibble' CONNECTION LIMIT 10;  

Then I created a database owned by test1 like this:

admin_user=# create database "test1$db1" with owner "test1";  

Then I started a new psql as user test1, and created a table:

test1$db1=> create table test_table (column_name varchar(50));  

But I couldn't insert into it:

test1$db1=> insert into test_table values ('some data');
ERROR:  permission denied for relation test_table

Checking the permissions shows that the table is owned by the postgres user:

test1$db1=> \dt
             List of relations
 Schema |      Name      | Type  |  Owner
--------+----------------+-------+----------
 public | test_table     | table | postgres

However, I can grant myself permissions and do stuff:

test1$db1=> grant insert, select on test_table to test1;
GRANT
test1$db1=> insert into test_table values ('some data');
INSERT 0 1
test1$db1=> select * from test_table;
 column_name
-------------
 some data
(1 row)

What's going on? I'm pretty sure this used to work. And the PostgreSQL docs for CREATE TABLE say

CREATE TABLE will create a new, initially empty table in the current database. The table will be owned by the user issuing the command.

Having to grant permissions to myself on my own tables doesn't sound like it's what I should have to do.
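Two hedged diagnostics that might narrow this down (standard commands, nothing specific to this setup): confirm which role the session is really running as, and, as a stopgap, reassign ownership of the mis-owned table:

-- Verify the role the session is actually using.
SELECT current_user, session_user;

-- Stopgap: take back ownership (run as a superuser or the current owner).
ALTER TABLE test_table OWNER TO test1;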

Any help much appreciated!

Replication Master/Master, both master stop by themself

Posted: 25 Jul 2013 11:36 AM PDT

I set up master/master replication with MySQL on a Gentoo OVH Release 2 box. Everything worked fine in my test phase, so I put it into production. The replication worked fine for 1 or 2 days, but this morning, I don't know why, my slave stopped running!

So now the log position is bad, which means I can't just restart the slave, and my replication doesn't work. :/

I want master/master replication to make a backup server with an IP failover, so only one server is writing/reading the database at any given time.

When I run SHOW SLAVE STATUS in MySQL, I see an error like: "Error 'Duplicate entry '411465' for key 1' on query. Default database .... etc"

Did this error stop my replication? If yes, why do I get this error? The second server does nothing on the database, so normally there should be no problem with the auto increment, right?

If someone has an idea of how I can fix it, they're welcome. :)

PS: Sorry if i made some grammatical mistakes in my question.
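A common precaution for master/master setups, sketched here with illustrative values: give each server a distinct auto-increment offset so the two masters can never generate the same key, which is the usual root cause of this kind of duplicate-entry error:

# Illustrative my.cnf values.
# Server A:
auto_increment_increment = 2
auto_increment_offset    = 1
# Server B:
auto_increment_increment = 2
auto_increment_offset    = 2

Even after fixing the root cause, the stopped slave still has to be resynchronized (or the duplicate row reconciled) before replication can resume.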

Avoiding a sort on an already clustered index for group by

Posted: 25 Jul 2013 05:21 PM PDT

On a table T with two fields, pid and did, the following query results in a seq. scan followed by a sort on pid:

select count(did), pid
from   T
group  by pid

Here is the query plan:

GroupAggregate  (cost=21566127.88..22326004.09 rows=987621 width=8)
  ->  Sort  (cost=21566127.88..21816127.88 rows=100000000 width=8)
        Sort Key: pid
        ->  Seq Scan on tc  (cost=0.00..1442478.00 rows=100000000 width=8)

However, the table already has a clustered index on pid.

Why doesn't Postgres simply scan the table and compute the group by? Why does it need to sort on pid again?

How can I force Postgres to use the clustered index for the group by?
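One relevant piece of background (general PostgreSQL behavior, not specific to this table): CLUSTER is a one-time physical reordering; Postgres does not keep the heap sorted afterwards, so the planner cannot assume rows still come out in pid order and must sort or hash. To check whether an index-based plan is considered at all, the planner can be nudged for a session:

-- Experiment only: discourage the sort-based plan for this session.
SET enable_sort = off;
EXPLAIN SELECT count(did), pid FROM T GROUP BY pid;
RESET enable_sort;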

Foreign Key off of a UNIONed View

Posted: 25 Jul 2013 04:34 PM PDT

I have a view that looks like this:

CREATE view reference.Test WITH SCHEMABINDING
as
    SELECT reference.HighlevelTestId as TestId, Name, IsActive,
           cast(1 as bit) as IsHighLevelTest
    FROM   reference.HighlevelTest
    UNION ALL
    SELECT LowLevelTestId as TestId, Name, IsActive,
           cast(0 as bit) as IsHighLevelTest
    FROM   reference.LowLevelTest
GO

NOTE: HighLevelTestId and LowLevelTestId are guaranteed to never have the same values (no conflicts).

I have another table that looks like this:

CREATE TABLE [Reference].[TestAddition](
    [TestId] [BigInt] NOT NULL,
    [OtherStuff] [bit] NOT NULL,
    ....
)

I would really like to FK my TestAddition table to my Test view on the TestId column (for referential integrity and ease of use with OData).

Is there any way to do that? (Any way to change my view to make this work?)
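As far as I know, SQL Server does not allow a foreign key to reference a view, even with SCHEMABINDING. A common workaround, sketched below with assumed names, is a supertype table that holds every TestId exactly once; HighlevelTest and LowLevelTest each foreign-key into it, and TestAddition references it instead of the view:

-- Sketch: supertype table holding all test ids.
CREATE TABLE Reference.TestBase (
    TestId          bigint NOT NULL PRIMARY KEY,
    IsHighLevelTest bit    NOT NULL
);

-- TestAddition then gets ordinary referential integrity.
ALTER TABLE Reference.TestAddition
    ADD CONSTRAINT FK_TestAddition_TestBase
    FOREIGN KEY (TestId) REFERENCES Reference.TestBase (TestId);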

Splitting a large SQL Server MDF file

Posted: 25 Jul 2013 05:18 PM PDT

I have a large (1.2 terabyte) SQL Server database that I need to migrate to a new server. Most of the database lives on a single, 1.25 TB data file, and a little bit sits on a much-more-manageable 550 GB file (which is practically empty).

Now, the tricky bit: the server to which I'm migrating only has three 700 GB volumes, meaning I need to somehow dissect this goliath into three roughly equal chunks. Most advice I've found involves creating three target files and running DBCC SHRINKFILE with EMPTYFILE on my main file to empty it into the targets, but that would take ages with a database this large.

Is there a recommended method for splitting a database this large? I'm considering using the Sql Server Integration Services Data Export feature to dump the data into a clone database with the proper file structure, but I'm curious as to whether there's a better way.
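For reference, the file-shuffling approach usually looks something like this sketch (file names, paths, and sizes are placeholders; EMPTYFILE migrates extents to the other files in the same filegroup):

-- Sketch: add target files on the new volumes, then drain the big file.
ALTER DATABASE BigDb
    ADD FILE (NAME = BigDb_data2, FILENAME = 'E:\Data\BigDb_data2.ndf', SIZE = 400GB),
             (NAME = BigDb_data3, FILENAME = 'F:\Data\BigDb_data3.ndf', SIZE = 400GB);

USE BigDb;
DBCC SHRINKFILE (BigDb_data1, EMPTYFILE);  -- long-running at this scale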

How to repair Microsoft.SqlServer.Types assembly

Posted: 25 Jul 2013 05:41 PM PDT

When I run checkdb('mydb'), this is the only error message printed.

Msg 8992, Level 16, State 1, Line 1
Check Catalog Msg 3857, State 1: The attribute (clr_name=NULL) is required but is missing for row (assembly_id=1) in sys.assemblies.

It is referring to 'Microsoft.SqlServer.Types'. I do see that in this db the clr_name is blank, but under the master db there is a value there.

I tried to drop or alter the assembly to add this value, but it's restricted.

By the way, this db was recently upgraded from SQL Server 2005 to 2008 R2.

Unable to connect to Amazon RDS instance

Posted: 25 Jul 2013 01:40 PM PDT

I recently created an Oracle instance on Amazon RDS. Unfortunately, I'm not able to connect to the instance using Oracle SQL Developer.

The (relevant) information I have from Amazon:

Endpoint - The DNS address of the DB Instance: xxx.yyy.eu-west-1.rds.amazonaws.com

DB Name - The definition of the term Database Name depends on the database engine in use. For the MySQL database engine, the Database Name is the name of a database hosted in your Amazon DB Instance. An Amazon DB Instance can host multiple databases. Databases hosted by the same DB Instance must have a unique name within that instance. For the Oracle database engine, Database Name is used to set the value of ORACLE_SID, which must be supplied when connecting to the Oracle RDS instance: ZZZ

Master Username - Name of master user for your DB Instance: org

Port - Port number on which the database accepts connections: 1521

From this information, the connection settings in SQL Developer are pretty obvious, so I don't really see what I could be missing...
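In case it helps, with those values the connection reduces to host xxx.yyy.eu-west-1.rds.amazonaws.com, port 1521, SID ZZZ, e.g. the SQL*Plus sketch below (assuming RDS also exposes the SID as the service name, which I believe is the default). If those settings are correct and the connection still fails, the usual culprit is the RDS security group not allowing inbound traffic on port 1521 from the client's IP address:

# Hypothetical connect string assembled from the values above:
sqlplus org@//xxx.yyy.eu-west-1.rds.amazonaws.com:1521/ZZZ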

Will Partitions and Indexes on the same table help in performance of Inserts and Selects?

Posted: 25 Jul 2013 04:41 PM PDT

I have a table containing the list of visitors, and this table has the following information:

  • Visitor Browser Information
  • Visitor Location Information
  • Visitor Time Information
  • No of Visits

I have a second table that maintains the history of each visit, which means that if the same visitor visits the site, I insert into the second table and update the number of visits in the first table.

The kind of reports that I have to generate for this table are

  1. Count of Visitors/day or days (Search Between days)
  2. Count of Visitors/month
  3. Count of Visitors/year
  4. Count of Visitors/browser or grouped by browsers

On average there are about 20,000 inserts to the second table and about 15,000 inserts to the first table, meaning 5,000 were updates to the first table (5,000 repeat visits).

I need to decide between partitioning the tables by month and sub-partitioning by day for reports 1, 2, and 3, and indexing the browser-related columns for report 4.

There will be more reports in the future; I'm not sure what clauses they will use.

Does partitioning/sub-partitioning along with indexing help in the performance of inserts and selects?

Should I perform partitioning on both tables?

I am currently using MySQL 5.5 + InnoDB
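A sketch of what month-range partitioning could look like in MySQL 5.5 (table and column names are assumptions, not from the actual schema; note that 5.5 range-partitions on an integer expression such as TO_DAYS, and every unique key must include the partitioning column):

-- Hypothetical sketch for the visit-history table.
CREATE TABLE visit_history (
    visit_id   BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    visitor_id BIGINT UNSIGNED NOT NULL,
    visited_at DATETIME        NOT NULL,
    browser    VARCHAR(64)     NOT NULL,
    PRIMARY KEY (visit_id, visited_at),  -- PK must include the partition column
    KEY idx_browser (browser)            -- serves the per-browser report
) ENGINE=InnoDB
PARTITION BY RANGE (TO_DAYS(visited_at)) (
    PARTITION p201306 VALUES LESS THAN (TO_DAYS('2013-07-01')),
    PARTITION p201307 VALUES LESS THAN (TO_DAYS('2013-08-01')),
    PARTITION pmax    VALUES LESS THAN MAXVALUE
);

Partition pruning then helps the date-range reports, while insert volume at ~20,000/day is unlikely to be the bottleneck either way.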

SSRS Bar Chart Issue [on hold]

Posted: 25 Jul 2013 11:06 AM PDT

I have a bar chart in which I am trying to limit the Sales data to specific months. I want a SUM of Sales $ for 2012 and a SUM of Sales $ for 2013. The month names are formatted like January 2012, January 2013, etc. How can I do this?

How should I model a binary-tree like data using Adjacency List?

Posted: 25 Jul 2013 02:06 PM PDT

I'm making a system where a user can recruit others, resulting in a binary tree model of users. I have decided to use an Adjacency List to model the data, but I have some doubts about the number of fields in a table.

I made the users table; there are 24 fields, including the node fields of the binary tree:

parent_id, side (left child or right child of its parent), indicated_id (the guy who recruited the user may be different from its parent in the binary tree) + 21 fields (email, name, password, tokens, phone, mobile, etc.)

So my doubts are:

1) Should I make a "nodes" table that wraps the binary-tree fields, leaving the unrelated user attributes in the users table, like this:

parent_id, indicated_id, side, user_id (pointing to the users table)

Or should I put everything in the users table? I think if I put everything in the users table I will avoid JOINs and the performance would be better. But if I wrap the tree fields in a nodes table, I will be able to query for a sub-tree without reading unrelated fields and join only the resulting rows, so the performance loss would be negligible?

Besides that, is there any problem with having a table with 20-30 fields that would justify the use of another table for a 1:1 relationship?

2) Is there any better way to model an Adjacency List?
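For concreteness, a sketch of the wrapped variant being discussed (column types are my assumptions):

-- Hypothetical sketch of the two-table (1:1) layout.
CREATE TABLE users (
    user_id INT PRIMARY KEY,
    email   VARCHAR(255) NOT NULL,
    name    VARCHAR(255) NOT NULL
    -- ... the other ~20 profile fields
);

CREATE TABLE nodes (
    user_id      INT PRIMARY KEY REFERENCES users (user_id),
    parent_id    INT NULL REFERENCES users (user_id),
    side         CHAR(1) NULL,  -- 'L' or 'R' under parent_id
    indicated_id INT NULL REFERENCES users (user_id)
);

With the 1:1 split, sub-tree traversals touch only the narrow nodes table; whether that outweighs the extra JOIN is mostly a question of row width and access patterns.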

How to remove column output in a for xml path query with a group by expression?

Posted: 25 Jul 2013 11:39 AM PDT

I forgot how to remove a column from being output in a FOR XML PATH query using a GROUP BY expression. I used this before, but somehow I lost the article. In the example below, I do not wish to have idForSomething output in my result, but I want to use it as a condition for my inner query.

SELECT idForSomething,
       SUM(allSomething) AS [@sum],
       (SELECT innerSomething AS [@inner], innerSomething2 AS [@inner2]
        FROM   someTable s2
        WHERE  s2.innerSomething = s1.idForSomething
        FOR XML PATH('innerlist'), TYPE)
FROM   someTable s1
WHERE  idForSomething = 1
GROUP  BY idForSomething
FOR XML PATH('listofsomethings')

Added XML Body:

<listofsomethings @sum="10">
    <innerlist @inner="..." @inner2="..." />
    <innerlist @inner="..." @inner2="..." />
    <innerlist @inner="..." @inner2="..." />
</listofsomethings>

I will look around again online, but I am asking for the syntax that tells SQL Server NOT to include the "idForSomething" column in the final output. I thought it was something like NOOUTPUT, but I can't remember, and that does not work.
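If memory serves, no keyword is needed: FOR XML PATH only emits columns that appear in the SELECT list, and a GROUP BY column does not have to be selected. A sketch of the query above with the column dropped from the output:

SELECT SUM(allSomething) AS [@sum],
       (SELECT innerSomething AS [@inner], innerSomething2 AS [@inner2]
        FROM   someTable s2
        WHERE  s2.innerSomething = s1.idForSomething
        FOR XML PATH('innerlist'), TYPE)
FROM   someTable s1
WHERE  idForSomething = 1
GROUP  BY idForSomething
FOR XML PATH('listofsomethings')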

Inserting query result to another table hangs on "Copying to temp table on disk" on MySQL

Posted: 25 Jul 2013 02:40 PM PDT

I started the process of inserting returned results into another table. The query groups the rows with respect to indexed IDs. This causes 149,000,000 rows to be reduced to 460,000 rows.

The query includes 3 table INNER JOINs, with each table having about 20,000,000 rows.

Further information: the process completes in about 12 seconds for a test file which has 1,000 input rows, and returns 703 rows.

I started the query earlier ### we don't know when earlier is ###, but it is still running in the state: "Copying to temp table on disk" after 38000 seconds (10 and a half hours).

I think there is a problem during the insertion process. What am I probably doing wrong here? If it helps, the operating system of the computer is Windows 7, and it has 3 GB RAM and an Intel Core 2 Duo 2.27 GHz processor. ### you forgot to tell us details on the hard drive. One partition in, one out, same disk, same partitions, etc ###

Here's my query as it currently reads:

INSERT INTO kdd.contents
            (adid,
             descriptionwords,
             purchasedkeywordwords,
             titlewords)
SELECT t.adid,
       dt.tokensid,
       pkt.tokensid,
       tt.tokensid
FROM   kdd.training t
       INNER JOIN kdd.purchasedkeywordid_tokensid pkt
               ON t.keywordid = pkt.purchasedkeywordid
       INNER JOIN kdd.titleid_tokensid tt
               ON t.titleid = tt.titleid
       INNER JOIN kdd.descriptionid_tokensid dt
               ON t.descriptionid = dt.descriptionid
GROUP  BY adid;
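"Copying to temp table on disk" means the implicit temporary table for the GROUP BY outgrew the in-memory limit and spilled to disk. A hedged tuning sketch (values are illustrative, and on a 3 GB machine they should stay modest); the effective in-memory cap is the smaller of the two variables, so both have to be raised:

-- Illustrative values only.
SET SESSION tmp_table_size      = 256 * 1024 * 1024;
SET SESSION max_heap_table_size = 256 * 1024 * 1024;

Even so, grouping a 149M-row join result will be slow without an index the optimizer can use for the GROUP BY, so checking the indexes on kdd.training (adid in particular) is probably worth more than the variables.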

Primary key type change not reflected in foreign keys with MySQL Workbench

Posted: 25 Jul 2013 03:41 PM PDT

I have a problem with MySQL Workbench and primary/foreign keys.

I have some tables whose PKs are involved in relationships with other tables. If I modify the type of a PK, the type of the FK doesn't automatically update to reflect the change.

Is there any solution? Do I have to manually modify all the relations?
