<?xml version="1.0" encoding="ISO-8859-1" ?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/"  xmlns:content="http://purl.org/rss/1.0/modules/content/">
			<channel>
			<title>The World of Drewsonian Code</title>
			<link>http://www.logicalimagination.com/drewsonian</link>
			<description>Drew Fierst, a senior programmer/analyst, shares valuable coding tips and tricks, application design insights, and relational database concepts.</description>
			<language>en-us</language>
			<pubDate>Wed, 10 Mar 2010 04:10:53 GMT</pubDate>
			<lastBuildDate>Mon, 23 Jan 2006 04:09:57 GMT</lastBuildDate>
			<generator>CF Blogger by DayDream Inc</generator>
			<docs>http://blogs.law.harvard.edu/tech/rss</docs>
			<managingEditor>drew@logicalimagination.com</managingEditor>
			<webMaster>drew@logicalimagination.com</webMaster>
			
			<item>
				<title>Manual Running Concatenation</title>
				<link>http://www.logicalimagination.com/drewsoniandisplay_blog.cfm?bid=F5742CAD-BCD4-2240-DD4FA8F01A4DC1A2</link>
		
		<description><![CDATA[In a previous entry, while discussing why I wanted to thank my most difficult customers, I mentioned a reporting technique called 'Manual Running Concatenation'. There seems to be enough interest in it to warrant the following discussion. It should be noted that my primary reporting tool is Crystal Reports (http://www.businessobjects.com), and the techniques described below were developed within that tool.<br /><br /><br />Crystal Reports has an interface that allows for the quick creation of running totals that is quite flexible in designating when the running total is incremented and when it is reset to zero, including the use of conditional formulas to specify when to execute either of those actions. One of its few drawbacks is that it is not possible to use one running total in these conditional formulas to determine when to increment or reset another running total.<br /><br /><br />As an example, let's consider a report of this month's sales information grouped by salesperson and sorted chronologically. Suppose we wish to know how many sales it took each salesperson to surpass $1,000 in sales. We can build a running total to add up the amount of each sale, resetting to zero for each salesperson. We'd like to be able to create another running total to count the orders that increments only if the first running total is less than or equal to one thousand, but such is not possible in Crystal Reports.<br /><br /><br />One of several ways to solve this issue involves the use of three formula fields, all of which make use of the same variable. The first formula field sets the value of the shared variable to zero, and gets placed in the salesperson group header so the variable is reset each time the report starts printing a new salesperson. This formula is usually made invisible, because we don't need to see it, as long as it does its job.<br /><br /><br />The second formula field considers the value of the running total field, and adds one to the value of the shared variable so long as the value of the running total field is less than or equal to one thousand. This field goes in the details section of the report, thus counting only the sales that contribute to the month's first thousand in sales for that salesperson. This formula also is usually made invisible, because we don't need to see it, as long as it does its job.<br /><br /><br />The last formula simply outputs the value of the variable, and is placed in the salesperson group footer section to display the number of sales it took for the salesperson to surpass $1,000. This approach is commonly called a 'Manual Running Total'.<br /><br /><br />I'm building a report that displays group therapy events, with a listing of attendees for each event. In the group header for each event, my client wants a comma-delimited list of the therapists hosting the event. And then I realize that the manual running total technique could be adapted to apply to text. It requires a string variable instead of a numeric variable, and the process of string concatenation instead of addition. I call it a 'Manual Running Concatenation'.<br /><br /><br />First Formula:<br />WhilePrintingRecords;<br />Global stringVar instructors := "";<br /><br /><br />Second Formula:<br />WhilePrintingRecords;<br />Global stringVar instructors;<br /><br />If instructors = "" then<br />	instructors := {Staff.LastName}<br />else<br />	instructors := instructors & ", " & {Staff.LastName};<br /><br /><br />Third Formula:<br />WhilePrintingRecords;<br />Global stringVar instructors;<br /><br /><br />Am I the only one who thinks it odd that there is no native reporting functionality for strings similar to the running total? At least the formula technique above will do the trick. Hmmmm. . . . I wonder where else I can apply this?  'Manual Running ______']]></description>
<content:encoded><![CDATA[ In a previous entry, while discussing why I wanted to thank my most difficult customers, I mentioned a reporting technique called 'Manual Running Concatenation'. There seems to be enough interest in it to warrant the following discussion. It should be noted that my primary reporting tool is Crystal Reports (http://www.businessobjects.com), and the techniques described below were developed within that tool.<br /><br /><br />Crystal Reports has an interface that allows for the quick creation of running totals that is quite flexible in designating when the running total is incremented and when it is reset to zero, including the use of conditional formulas to specify when to execute either of those actions. One of its few drawbacks is that it is not possible to use one running total in these conditional formulas to determine when to increment or reset another running total.<br /><br /><br />As an example, let's consider a report of this month's sales information grouped by salesperson and sorted chronologically. Suppose we wish to know how many sales it took each salesperson to surpass $1,000 in sales. We can build a running total to add up the amount of each sale, resetting to zero for each salesperson. We'd like to be able to create another running total to count the orders that increments only if the first running total is less than or equal to one thousand, but such is not possible in Crystal Reports.<br /><br /><br />One of several ways to solve this issue involves the use of three formula fields, all of which make use of the same variable. The first formula field sets the value of the shared variable to zero, and gets placed in the salesperson group header so the variable is reset each time the report starts printing a new salesperson. This formula is usually made invisible, because we don't need to see it, as long as it does its job.<br /><br /><br />The second formula field considers the value of the running total field, and adds one to the value of the shared variable so long as the value of the running total field is less than or equal to one thousand. This field goes in the details section of the report, thus counting only the sales that contribute to the month's first thousand in sales for that salesperson. This formula also is usually made invisible, because we don't need to see it, as long as it does its job.<br /><br /><br />The last formula simply outputs the value of the variable, and is placed in the salesperson group footer section to display the number of sales it took for the salesperson to surpass $1,000. This approach is commonly called a 'Manual Running Total'.<br /><br /><br />I'm building a report that displays group therapy events, with a listing of attendees for each event. In the group header for each event, my client wants a comma-delimited list of the therapists hosting the event. And then I realize that the manual running total technique could be adapted to apply to text. It requires a string variable instead of a numeric variable, and the process of string concatenation instead of addition. I call it a 'Manual Running Concatenation'.<br /><br /><br />First Formula:<br />WhilePrintingRecords;<br />Global stringVar instructors := "";<br /><br /><br />Second Formula:<br />WhilePrintingRecords;<br />Global stringVar instructors;<br /><br />If instructors = "" then<br />	instructors := {Staff.LastName}<br />else<br />	instructors := instructors & ", " & {Staff.LastName};<br /><br /><br />Third Formula:<br />WhilePrintingRecords;<br />Global stringVar instructors;<br /><br /><br />Am I the only one who thinks it odd that there is no native reporting functionality for strings similar to the running total? At least the formula technique above will do the trick. Hmmmm. . . . I wonder where else I can apply this?  'Manual Running ______'  ]]> 
  </content:encoded>
				<category>Reporting</category>
				<pubDate>Mon, 23 Jan 2006 04:09:57 GMT</pubDate>
				<guid>http://www.logicalimagination.com/drewsoniandisplay_blog.cfm?bid=F5742CAD-BCD4-2240-DD4FA8F01A4DC1A2</guid>
			</item>
			
			<item>
				<title>The Trick To Creating Multiple Records</title>
				<link>http://www.logicalimagination.com/drewsoniandisplay_blog.cfm?bid=840B0E64-BCD4-2240-DC8E1BDE3D0BF06B</link>
		
		<description><![CDATA[It happens in practically every reporting class I teach: a student has a report that was working fine until s/he added a new table to the underlying query, and all of a sudden there are multiple copies of each record in the report! Imagine that, the report is broken!<br /><br /><br />I remember back when this phenomenon used to stump me. When I understood relational databases in greater depth, the mystery disappeared. If I have a data set consisting of customer records and then link in a table of addresses, it should not surprise me that there would be multiple records in the data set for each customer that has more than one address. Any time I have a table join between tables where there is the possibility of many records in one table (addresses) related to a given record in the other table (customers), this multiplication of records can happen.<br /><br /><br />Many times, this is what we want in the report; other times it is not. In those cases where the multiple records are not desired, it is customary to group the records based on the values in the table on the 'one' side of the join (customers, in the preceding example) and somehow summarize the fields in the table on the 'many' side of the join (addresses, in the preceding example) using any of the traditional summary operators such as 'sum', 'average', 'min', 'max', 'count', etc.<br /><br /><br />So we can multiply our records by adding in a related table that embodies a one-to-many relationship, and we can prevent the multiplication of records by grouping the data set. But what if we need to multiply the records when we have no related table that will do the trick?<br /><br /><br />I'm working on an insurance application that has some rather demanding reporting requirements. When an insurance certificate is cancelled, a letter must be sent out informing the insured parties of the cancellation. Each certificate could potentially require a different number of copies of the cancellation letter. This is easy to do; I just store the number of required copies in the certificate table, and use that value to tell the printer how many copies to print when creating the print job.<br /><br /><br />Recently, the customer decided that she wants to print the cancellation letters once a week, in a batch process. How do I print a batch of letters when one may require two copies and the next requires three copies and yet another requires one copy? After considerable deliberation, the solution I'm implementing requires an unusual type of join in the underlying SQL query.<br /><br /><br />In most links between tables, whether an INNER, LEFT OUTER, or RIGHT OUTER join, the records on both sides of the join are considered to be related only when the values in the linked fields are equal to each other. So an address record would be related to a customer record only if the CustomerID in the address table is equal to the CustomerID in the customer table.<br /><br /><br />However, there are several other methods for associating records in one table to those in another. They can be considered to be related when the values in one field are less than the values in the other field, or greater than, or even not equal to the values in the other field.<br /><br /><br />In order to create a record set for the cancellation letters that contains the correct number of copies of each record, I need to create a table that I call "tblMultiplier". This table consists of just one field, called "LinkField", an auto-increment primary key field. I generate 100 records in this table, so the values in "LinkField" range from 1 to 100 (hopefully no-one will require more than 100 copies). The pertinent field in the certificate table, "CancelCopies", gets linked to the multiplier table on a less-than-or-equal join.<br /><br /><br />SELECT xxxx FROM tblCertificate c INNER JOIN tblMultiplier m ON m.LinkField <= c.CancelCopies<br /><br /><br />In this way, each record in the multiplier table relates to each record in the certificate table as long as the value in "LinkField" is less than or equal to the value in "CancelCopies"; so if a record in the certificate table specifies a value of "3" in the CancelCopies field, it will match three different records from tblMultiplier, those with a LinkField value of 1, 2 and 3, and the batch printing of the report will print the correct number of copies for each certificate.]]></description>
<content:encoded><![CDATA[ It happens in practically every reporting class I teach: a student has a report that was working fine until s/he added a new table to the underlying query, and all of a sudden there are multiple copies of each record in the report! Imagine that, the report is broken!<br /><br /><br />I remember back when this phenomenon used to stump me. When I understood relational databases in greater depth, the mystery disappeared. If I have a data set consisting of customer records and then link in a table of addresses, it should not surprise me that there would be multiple records in the data set for each customer that has more than one address. Any time I have a table join between tables where there is the possibility of many records in one table (addresses) related to a given record in the other table (customers), this multiplication of records can happen.<br /><br /><br />Many times, this is what we want in the report; other times it is not. In those cases where the multiple records are not desired, it is customary to group the records based on the values in the table on the 'one' side of the join (customers, in the preceding example) and somehow summarize the fields in the table on the 'many' side of the join (addresses, in the preceding example) using any of the traditional summary operators such as 'sum', 'average', 'min', 'max', 'count', etc.<br /><br /><br />So we can multiply our records by adding in a related table that embodies a one-to-many relationship, and we can prevent the multiplication of records by grouping the data set. But what if we need to multiply the records when we have no related table that will do the trick?<br /><br /><br />I'm working on an insurance application that has some rather demanding reporting requirements. When an insurance certificate is cancelled, a letter must be sent out informing the insured parties of the cancellation. Each certificate could potentially require a different number of copies of the cancellation letter. This is easy to do; I just store the number of required copies in the certificate table, and use that value to tell the printer how many copies to print when creating the print job.<br /><br /><br />Recently, the customer decided that she wants to print the cancellation letters once a week, in a batch process. How do I print a batch of letters when one may require two copies and the next requires three copies and yet another requires one copy? After considerable deliberation, the solution I'm implementing requires an unusual type of join in the underlying SQL query.<br /><br /><br />In most links between tables, whether an INNER, LEFT OUTER, or RIGHT OUTER join, the records on both sides of the join are considered to be related only when the values in the linked fields are equal to each other. So an address record would be related to a customer record only if the CustomerID in the address table is equal to the CustomerID in the customer table.<br /><br /><br />However, there are several other methods for associating records in one table to those in another. They can be considered to be related when the values in one field are less than the values in the other field, or greater than, or even not equal to the values in the other field.<br /><br /><br />In order to create a record set for the cancellation letters that contains the correct number of copies of each record, I need to create a table that I call "tblMultiplier". This table consists of just one field, called "LinkField", an auto-increment primary key field. I generate 100 records in this table, so the values in "LinkField" range from 1 to 100 (hopefully no-one will require more than 100 copies). The pertinent field in the certificate table, "CancelCopies", gets linked to the multiplier table on a less-than-or-equal join.<br /><br /><br />SELECT xxxx FROM tblCertificate c INNER JOIN tblMultiplier m ON m.LinkField <= c.CancelCopies<br /><br /><br />In this way, each record in the multiplier table relates to each record in the certificate table as long as the value in "LinkField" is less than or equal to the value in "CancelCopies"; so if a record in the certificate table specifies a value of "3" in the CancelCopies field, it will match three different records from tblMultiplier, those with a LinkField value of 1, 2 and 3, and the batch printing of the report will print the correct number of copies for each certificate.  ]]> 
  </content:encoded>
				<category>SQL</category>
				<pubDate>Sun, 01 Jan 2006 03:38:03 GMT</pubDate>
				<guid>http://www.logicalimagination.com/drewsoniandisplay_blog.cfm?bid=840B0E64-BCD4-2240-DC8E1BDE3D0BF06B</guid>
			</item>
			
			<item>
				<title>Boo Hoo or Woo Hoo?</title>
				<link>http://www.logicalimagination.com/drewsoniandisplay_blog.cfm?bid=31AD65AC-BCD4-2240-D00976D161AD7528</link>
		
		<description><![CDATA["Reasonable people adapt themselves to the world. Unreasonable people attempt to adapt the world to themselves. All progress, therefore, depends on unreasonable people." - George Bernard Shaw<br /><br /><br />It's one of the few constants in today's changing world. Wherever two or more software developers gather together, the war stories commence, with each participant vying for position, each relating a more outrageous story than the last about the trials of working with demanding clients. Each persecuted coder takes perverse pleasure in relaying the details as if it's a badge of honor to be so downtrodden.<br /><br /><br />I, too, have felt crushed under the completely unreasonable demands of clients. Do they not realize that my first attempts at developing their applications are perfect as they stand? So, because I have a list of client demands that have driven me crazy over the years, I will occasionally share these tribulations with other developers, just to see how I measure up on the martyrdom scale.  <br /><br /><br />However, I am really masquerading undercover during these developer bull-sessions. It is my opinion that we should thank our most difficult customers. Really.<br /><br /><br />To illustrate my point, consider one of my clients who needs a report of events that each have multiple attendees and multiple moderators. He finds it acceptable to display a vertical list of attendees, but he absolutely must have a comma-delimited list of moderators in the header for each event. In creating this solution, I developed a reporting technique I refer to as 'manual running concatenation' (watch for details on this technique in a future blog).<br /><br /><br />Another client keeps multiplying the ways she wants to bill her clients. In her efforts to remain competitive in the marketplace, she wants to provide for every conceivable possibility in structuring the charges to her clients. Difficult, to say the least. But guess what? I have achieved a greater proficiency with database stored procedures and triggers than I ever imagined.<br /><br /><br />Yet another client keeps changing his mind regarding both the presentation of the data and the features of the application. Items that are added one week are likely to be removed the next, with a fair chance that they will be added yet again a few weeks later. He has taught me the true value of commenting my code and keeping backups of prior application versions.<br /><br /><br />These difficult clients constantly push me to become a better programmer. They force me to transcend the same old programming techniques that I have worn out over the years. Without them, I might fall into the trap of continually using the same source code, changing the labels and recompiling, ad infinitum.<br /><br /><br />Sure, the most unreasonable customers are a source of tremendous work and numerous migraines.  Yet they are also a major driving force behind the development of new skills, techniques and technologies. So if you were to be totally honest with yourself, what have you learned from your most demanding clients?]]></description>
<content:encoded><![CDATA[ "Reasonable people adapt themselves to the world. Unreasonable people attempt to adapt the world to themselves. All progress, therefore, depends on unreasonable people." - George Bernard Shaw<br /><br /><br />It's one of the few constants in today's changing world. Wherever two or more software developers gather together, the war stories commence, with each participant vying for position, each relating a more outrageous story than the last about the trials of working with demanding clients. Each persecuted coder takes perverse pleasure in relaying the details as if it's a badge of honor to be so downtrodden.<br /><br /><br />I, too, have felt crushed under the completely unreasonable demands of clients. Do they not realize that my first attempts at developing their applications are perfect as they stand? So, because I have a list of client demands that have driven me crazy over the years, I will occasionally share these tribulations with other developers, just to see how I measure up on the martyrdom scale.  <br /><br /><br />However, I am really masquerading undercover during these developer bull-sessions. It is my opinion that we should thank our most difficult customers. Really.<br /><br /><br />To illustrate my point, consider one of my clients who needs a report of events that each have multiple attendees and multiple moderators. He finds it acceptable to display a vertical list of attendees, but he absolutely must have a comma-delimited list of moderators in the header for each event. In creating this solution, I developed a reporting technique I refer to as 'manual running concatenation' (watch for details on this technique in a future blog).<br /><br /><br />Another client keeps multiplying the ways she wants to bill her clients. In her efforts to remain competitive in the marketplace, she wants to provide for every conceivable possibility in structuring the charges to her clients. Difficult, to say the least. But guess what? I have achieved a greater proficiency with database stored procedures and triggers than I ever imagined.<br /><br /><br />Yet another client keeps changing his mind regarding both the presentation of the data and the features of the application. Items that are added one week are likely to be removed the next, with a fair chance that they will be added yet again a few weeks later. He has taught me the true value of commenting my code and keeping backups of prior application versions.<br /><br /><br />These difficult clients constantly push me to become a better programmer. They force me to transcend the same old programming techniques that I have worn out over the years. Without them, I might fall into the trap of continually using the same source code, changing the labels and recompiling, ad infinitum.<br /><br /><br />Sure, the most unreasonable customers are a source of tremendous work and numerous migraines.  Yet they are also a major driving force behind the development of new skills, techniques and technologies. So if you were to be totally honest with yourself, what have you learned from your most demanding clients?  ]]> 
  </content:encoded>
				<category>Programming</category>
				<pubDate>Fri, 16 Dec 2005 03:46:53 GMT</pubDate>
				<guid>http://www.logicalimagination.com/drewsoniandisplay_blog.cfm?bid=31AD65AC-BCD4-2240-D00976D161AD7528</guid>
			</item>
			
			<item>
				<title>The disappearing LEFT OUTER JOIN</title>
				<link>http://www.logicalimagination.com/drewsoniandisplay_blog.cfm?bid=1C61DFC1-BCD4-2240-DD375DB7CCAA4347</link>
		
		<description><![CDATA["What happened to my LEFT OUTER JOIN? My filter criteria destroyed the LEFT OUTERness of my SQL join."<br /><br /><br />NOTE: the application of the techniques described herein requires the use of ANSI/92 or newer standards with regards to the SQL language.<br /><br /><br />In the early part of my career in application development, I would often create a SQL statement with a LEFT OUTER join to pull information from a couple of tables in the database. I would then apply a selection criteria in the WHERE clause, and all of a sudden the data being returned would behave as if I had used an INNER join even though I could plainly see 'LEFT OUTER' in the SQL statement.<br /><br /><br /><div align='center'><img src='images/Relation1.jpg' border='0' /></div><br /><br /><br />An example may help make things clear at this point. Suppose we have a table of contacts (tblContacts) and a table of phone numbers (tblPhones). Each phone has a type (e.g. "home", "office", "cell", "pager", etc.). If I wanted a list of all customers and their cell phone numbers, I would create a LEFT OUTER join from tblCustomers to tblPhones, and then since I only wanted cell phones I would apply a criteria something like "WHERE tblPhones.PhoneTypeID = 3".<br /><br /><br />SELECT tblContacts.FirstName, tblContacts.LastName, tblPhones.PhoneNumber<br />FROM tblContacts LEFT OUTER JOIN tblPhones ON tblContacts.ContactID = tblPhones.ContactID<br />WHERE tblPhones.PhoneTypeID = 3<br /><br /><br />All of a sudden, I would only see customers that had cell phone numbers, just as if I had specified an INNER JOIN between the two tables. What was happening? Well, the left outer join does indeed create a recordset including all contacts, regardless of whether they have phone numbers or not, and then the filter criteria is applied to that set of records. But the explicit criteria "PhoneTypeID=3" also carries with it an implicit criteria; basically, it implies that PhoneTypeID is not NULL. Thus the filter criteria eliminates any records where the PhoneTypeID is null, i.e. contacts without phone numbers.<br /><br /><br />So we're tempted to add a second criteria: change it to "WHERE PhoneTypeID = 3 OR PhoneTypeID IS NULL". This would work fine, if the only phone types were cell phones. However, consider a contact that has a pager and a home phone. In the set of records built by the joining of the two tables, there would be two records for this contact, one with a pager number and one with a home phone number. Neither record for this contact would have a phone type of "cell" or NULL, so this contact would still not be included in the filtered set of records.<br /><br /><br />The standard approach to solving this issue is to use a subquery, sometimes called a sub-select, to accomplish the filtering of phone numbers, and joining to the results of this subquery. (If the use of a set of filtered phone numbers would be frequently useful, sometimes a view will be built in the database to accomplish the filtering, and we could link to that instead of the subquery.)<br /><br /><br />SELECT tblContacts.FirstName, tblContacts.LastName, cells.PhoneNumber <br />FROM tblContacts LEFT OUTER JOIN <br />(SELECT ContactID, PhoneNumber FROM tblPhones WHERE PhoneTypeID = 3) cells<br />ON tblContacts.ContactID = cells.ContactID<br /><br /><br />This will correctly return the set of all contacts, regardless of the presence or absence of phones for them, along with the cell phones for those contacts that have them. However, this approach can be somewhat confusing to read, and perhaps (depending on the database system, the table design, the number/size of records and dozens of other factors) a little slower than the alternative I'll describe in a moment. The subquery approach, in most cases, requires that the database perform one query and store that resultset temporarily and then link the other table to the temporary resultset. <br /><br /><br />Instead, the entire process can usually be performed in one step by specifying the filter criteria in the link. <br /><br /><br />We (those of us who use the ANSI/92 SQL joins) are so used to specifying a join where 'tableA.field = tableB.field' that we often forget that the join syntax is vastly more flexible. We can specify multiple fields between the tables, we can specify any relational operator, not just the equals sign, and we can even specify a constant in place of one of the field names. This last part is where we can apply a filter criteria in the link.<br /><br /><br />SELECT tblContacts.FirstName, tblContacts.LastName, tblPhones.PhoneNumber<br />FROM tblContacts LEFT OUTER JOIN tblPhones<br />ON tblContacts.ContactID = tblPhones.ContactID AND tblPhones.PhoneTypeID = 3<br /><br /><br />In this SQL statement, the "PhoneTypeID = 3" is applied to the join criteria, and tblPhones is filtered before the data is joined, so the resultset will include all contacts, regardless of any phone information, and only the cell phone data. Most database systems can accomplish this without the creation of temporary tables, so the performance is frequently better than when using subqueries. It also has the added benefit of being easier to read, at least once we become familiar with the technique.]]></description>
<content:encoded><![CDATA[ "What happened to my LEFT OUTER JOIN? My filter criteria destroyed the LEFT OUTERness of my SQL join."<br /><br /><br />NOTE: the application of the techniques described herein requires the use of ANSI/92 or newer standards with regards to the SQL language.<br /><br /><br />In the early part of my career in application development, I would often create a SQL statement with a LEFT OUTER join to pull information from a couple of tables in the database. I would then apply a selection criteria in the WHERE clause, and all of a sudden the data being returned would behave as if I had used an INNER join even though I could plainly see 'LEFT OUTER' in the SQL statement.<br /><br /><br /><div align='center'><img src='images/Relation1.jpg' border='0' /></div><br /><br /><br />An example may help make things clear at this point. Suppose we have a table of contacts (tblContacts) and a table of phone numbers (tblPhones). Each phone has a type (e.g. "home", "office", "cell", "pager", etc.). If I wanted a list of all customers and their cell phone numbers, I would create a LEFT OUTER join from tblCustomers to tblPhones, and then since I only wanted cell phones I would apply a criteria something like "WHERE tblPhones.PhoneTypeID = 3".<br /><br /><br />SELECT tblContacts.FirstName, tblContacts.LastName, tblPhones.PhoneNumber<br />FROM tblContacts LEFT OUTER JOIN tblPhones ON tblContacts.ContactID = tblPhones.ContactID<br />WHERE tblPhones.PhoneTypeID = 3<br /><br /><br />All of a sudden, I would only see customers that had cell phone numbers, just as if I had specified an INNER JOIN between the two tables. What was happening? Well, the left outer join does indeed create a recordset including all contacts, regardless of whether they have phone numbers or not, and then the filter criteria is applied to that set of records. But the explicit criteria "PhoneTypeID=3" also carries with it an implicit criteria; basically, it implies that PhoneTypeID is not NULL. Thus the filter criteria eliminates any records where the PhoneTypeID is null, i.e. contacts without phone numbers.<br /><br /><br />So we're tempted to add a second criteria: change it to "WHERE PhoneTypeID = 3 OR PhoneTypeID IS NULL". This would work fine, if the only phone types were cell phones. However, consider a contact that has a pager and a home phone. In the set of records built by the joining of the two tables, there would be two records for this contact, one with a pager number and one with a home phone number. Neither record for this contact would have a phone type of "cell" or NULL, so this contact would still not be included in the filtered set of records.<br /><br /><br />The standard approach to solving this issue is to use a subquery, sometimes called a sub-select, to accomplish the filtering of phone numbers, and joining to the results of this subquery. (If the use of a set of filtered phone numbers would be frequently useful, sometimes a view will be built in the database to accomplish the filtering, and we could link to that instead of the subquery.)<br /><br /><br />SELECT tblContacts.FirstName, tblContacts.LastName, cells.PhoneNumber <br />FROM tblContacts LEFT OUTER JOIN <br />(SELECT ContactID, PhoneNumber FROM tblPhones WHERE PhoneTypeID = 3) cells<br />ON tblContacts.ContactID = cells.ContactID<br /><br /><br />This will correctly return the set of all contacts, regardless of the presence or absence of phones for them, along with the cell phones for those contacts that have them. However, this approach can be somewhat confusing to read, and perhaps (depending on the database system, the table design, the number/size of records and dozens of other factors) a little slower than the alternative I'll describe in a moment. The subquery approach, in most cases, requires that the database perform one query and store that resultset temporarily and then link the other table to the temporary resultset. <br /><br /><br />Instead, the entire process can usually be performed in one step by specifying the filter criteria in the link. <br /><br /><br />We (those of us who use the ANSI/92 SQL joins) are so used to specifying a join where 'tableA.field = tableB.field' that we often forget that the join syntax is vastly more flexible. We can specify multiple fields between the tables, we can specify any relational operator, not just the equals sign, and we can even specify a constant in place of one of the field names. This last part is where we can apply a filter criteria in the link.<br /><br /><br />SELECT tblContacts.FirstName, tblContacts.LastName, tblPhones.PhoneNumber<br />FROM tblContacts LEFT OUTER JOIN tblPhones<br />ON tblContacts.ContactID = tblPhones.ContactID AND tblPhones.PhoneTypeID = 3<br /><br /><br />In this SQL statement, the "PhoneTypeID = 3" is applied to the join criteria, and tblPhones is filtered before the data is joined, so the resultset will include all contacts, regardless of any phone information, and only the cell phone data. Most database systems can accomplish this without the creation of temporary tables, so the performance is frequently better than when using subqueries. It also has the added benefit of being easier to read, at least once we become familiar with the technique.  ]]> 
  </content:encoded>
				<category>SQL</category>
				<pubDate>Mon, 12 Dec 2005 00:32:22 GMT</pubDate>
				<guid>http://www.logicalimagination.com/drewsoniandisplay_blog.cfm?bid=1C61DFC1-BCD4-2240-DD375DB7CCAA4347</guid>
			</item>
			
			<item>
				<title>Everything&apos;s an Entity</title>
				<link>http://www.logicalimagination.com/drewsoniandisplay_blog.cfm?bid=C5A7B427-BCD4-2240-D5EE6590C9B151E1</link>
		
		<description><![CDATA[In today's complex business world, relationships between various companies and sundry people are no longer as clear as they might have previously been.  Someone that is a supplier for you in one capacity may also be a client in another capacity, and a competitor in yet another capacity.<br /><br /><br />The website <a href='http://www.worldwidewords.org' target='new'>www.worldwidewords.org</a> credits Ray Noorda, the founder of Novell, with the creation of the term "coopetition" to describe relationships of this type.  Several years ago, in developing an internal application for my employer, I needed a name for companies that had relationships of this type.  My development team came up with the term "competipeer", and it's been in use at our company ever since.<br /><br /><br />In designing a database, it is a standard practice to store common information for different objects in the same table.  For example, in an application that has to track information about students and instructors, there may be a "Person" table that contains the data points that all people have, regardless of whether they are a student or an instructor.  Then there would also be a "Student" table and an "Instructor" table, each containing only the data points that differentiate these objects from just any person, as well as the primary key of the Person record for the Student or Instructor in question.<br /><br /><br />This approach really pays off when an instructor for one class decides to become a student in another class.  Since that person's Instructor record and Student record contain the same primary key value from the Person table, we have captured the fact that this student is the same person as that instructor.  We also save some time and effort on data entry, as the core person information is already entered.<br /><br /><br />A couple of years ago, I was involved in the development of an application for an insurance company.  One of the earliest challenges was to correctly represent the relationships between the various business data required.  Just one example: an insurance certificate can be issued to a person or to a company.  How should that be represented in the database?<br /><br /><br />Our first thoughts were to provide fields in the Certificate table for both a CompanyID and a PersonID and to populate only one of them depending on whether a person was insured or a company.  My wife, who is a web- and application-developer, is no stranger to database design.  She confirmed that our reservations were well-founded and that this approach was not relationally sound.  We really didn't want two fields where only one would be used on any given record.  She recognized that we could use the common-table approach even when there was no data in common between the objects in question.<br /><br /><br />We ended up creating an "Entity" table with nothing but a primary key: an "EntityID".  Each company would be an Entity and get an EntityID.  Each person would be an Entity and get an EntityID.  The insured party for an insurance certificate would be an EntityID.  For data retrieval, we would build a view in the database that would combine all companies (and their EntityIDs) with all people (and their EntityIDs).  The SQL queries would then link the Certificate table to this view (on the EntityID field) to retrieve each insured party's name, regardless of whether the insured party was a company or a person.<br /><br /><br />Since then, virtually every project I've been involved with has provided some opportunity to use the concept of an EntityID to combine disparate objects that need to fill the same role in relation to another object.]]></description>
<content:encoded><![CDATA[ In today's complex business world, relationships between various companies and sundry people are no longer as clear as they might have previously been.  Someone that is a supplier for you in one capacity may also be a client in another capacity, and a competitor in yet another capacity.<br /><br /><br />The website <a href='http://www.worldwidewords.org' target='new'>www.worldwidewords.org</a> credits Ray Noorda, the founder of Novell, with the creation of the term "coopetition" to describe relationships of this type.  Several years ago, in developing an internal application for my employer, I needed a name for companies that had relationships of this type.  My development team came up with the term "competipeer", and it's been in use at our company ever since.<br /><br /><br />In designing a database, it is a standard practice to store common information for different objects in the same table.  For example, in an application that has to track information about students and instructors, there may be a "Person" table that contains the data points that all people have, regardless of whether they are a student or an instructor.  Then there would also be a "Student" table and an "Instructor" table, each containing only the data points that differentiate these objects from just any person, as well as the primary key of the Person record for the Student or Instructor in question.<br /><br /><br />This approach really pays off when an instructor for one class decides to become a student in another class.  Since that person's Instructor record and Student record contain the same primary key value from the Person table, we have captured the fact that this student is the same person as that instructor.  We also save some time and effort on data entry, as the core person information is already entered.<br /><br /><br />A couple of years ago, I was involved in the development of an application for an insurance company.  One of the earliest challenges was to correctly represent the relationships between the various business data required.  Just one example: an insurance certificate can be issued to a person or to a company.  How should that be represented in the database?<br /><br /><br />Our first thoughts were to provide fields in the Certificate table for both a CompanyID and a PersonID and to populate only one of them depending on whether a person was insured or a company.  My wife, who is a web- and application-developer, is no stranger to database design.  She confirmed that our reservations were well-founded and that this approach was not relationally sound.  We really didn't want two fields where only one would be used on any given record.  She recognized that we could use the common-table approach even when there was no data in common between the objects in question.<br /><br /><br />We ended up creating an "Entity" table with nothing but a primary key: an "EntityID".  Each company would be an Entity and get an EntityID.  Each person would be an Entity and get an EntityID.  The insured party for an insurance certificate would be an EntityID.  For data retrieval, we would build a view in the database that would combine all companies (and their EntityIDs) with all people (and their EntityIDs).  The SQL queries would then link the Certificate table to this view (on the EntityID field) to retrieve each insured party's name, regardless of whether the insured party was a company or a person.<br /><br /><br />Since then, virtually every project I've been involved with has provided some opportunity to use the concept of an EntityID to combine disparate objects that need to fill the same role in relation to another object.  ]]> 
  </content:encoded>
				<category>Database</category>
				<pubDate>Fri, 25 Nov 2005 04:21:41 GMT</pubDate>
				<guid>http://www.logicalimagination.com/drewsoniandisplay_blog.cfm?bid=C5A7B427-BCD4-2240-D5EE6590C9B151E1</guid>
			</item>
			
			<item>
				<title>Just what is &apos;raw&apos; data and what is &apos;calculated&apos;?</title>
				<link>http://www.logicalimagination.com/drewsoniandisplay_blog.cfm?bid=C4990E05-BCD4-2240-D28DC9E99F16A2AD</link>
		
		<description><![CDATA[Viewing a question from the proper perspective often makes the answer clear.<br /><br /><br />I'm building an application that needs to be able to specify the profit calculations in two different ways.  Once the subtotal of all of the costs is generated, users need to be able to either input a percentage markup over the subtotal to generate the selling price (PRICE = COST * (1 + MARKUP)), or input a percentage margin of profit that the final selling price will have (PRICE = COST / (1 - MARGIN)).  Users need the flexibility to revisit the calculations at any time and enter either of these values and re-calculate.<br /><br /><br />The first dilemma surfaces regarding which values to store in the database.  Obviously, the individual components of the subtotal are stored, but not the subtotal.  However, in regards to the final selling price (the 'grand total', if you will) there are three interdependent values (the final selling price, the margin and the markup), any one of which could be stored and the remaining two could be calculated.<br /><br /><br />The standard guidelines of database design instruct us to avoid storing calculated values, so it may seem natural to store the markup percentage, and calculate the grand total (and the margin percentage) when necessary; at least, that was my first thought.  But, then what about storing the margin and calculating the other two?  There seems to be no reason to favor margin over markup or vice-versa.<br /><br /><br />Upon further reflection, I would argue against either of these choices.  One issue arises from rounding concerns.  The final selling price often needs to be rounded off to the nearest penny (or dollar, even), which could slightly affect the actual markup percentage.  Another issue concerns the philosophical concepts of right and wrong.  Can we say that there are right and wrong choices in this matter?  I think we can.<br /><br /><br />My point of view regarding database design has always been that we want to capture reality in the structure of our databases.  I always push the customer towards the question "What is the real relationship between these objects in the context of your business?" to determine the required structure of their database.  The point at which details become irrelevant to their business will determine the limits of the portion of reality that we wish to mirror in their database.<br /><br /><br />When applying that point of view to our question of totals and margins, it would seem to me that the most "real" quantity of the three is the selling price.  In the actual transaction between the seller and buyer, the item costs "X" to produce, and it sells for "Y".  Margins and markups are really summary values derived from those basic facts.  Therefore, I've decided to store the final selling price in the database and to calculate the margins and markups when needed.<br /><br /><br />The second dilemma arises when actually trying to implement this approach in the user interface.  The users are going to be entering one of the calculated values, and the application will then back-calculate the "raw" data.  So how does this all work out in an object-oriented application?<br /><br /><br />Let's start with a Widget object that has costs for raw materials, production and finishing.  Each of those costs would be stored in the Widget table of the database along with its selling price.  The Widget object would have three read/write properties for raw materials, production and finishing costs along with a read-only property for a subtotal of those costs.  The selling price would be a read/write property, but it would only be written when loading up from the database; during the rest of the life of the Widget object, the selling price would be manipulated by assigning the Widget a new margin or markup.  (Depending on the development language, this might be enforced by the design of the Widget object and the scope of its member variables and properties; otherwise, it must rely on the restraint of the programmer to avoid directly assigning a selling price in the UI layer.) <br /><br /><br />The markup and margin percentages would be read/write properties, but there would be no local variable inside of the Widget object tracking these values.  Instead, they would be directly calculated (from the subtotal and selling price) whenever they need to be retrieved; whenever either of them is assigned, the Widget object would merely re-calculate the selling price and store that locally.<br /><br /><br />In an event-driven, conventional application, the user's changes to the onscreen values of the margin and markup would be immediately written to the in-memory Widget object, and the selling price would then be immediately updated onscreen.<br /><br /><br />In an asynchronous (e.g. web) application, it would involve just a bit of extra work in the user-interface layer.  Suppose a Widget object is loaded in memory and displayed onscreen.  There would be a place where the user could enter a new margin and a place where the user could enter a new markup.  When saving the user's changes, we wouldn't want to blindly write the onscreen values for both properties back into our Widget object.  If the user has changed the first one, but not the second, and we write both values back onto the Widget object, the user's changes (the first value) would get overwritten by the original values (the second one) with the net effect of no change at all to the selling price.  Only the changed value(s) would get written back onto the Widget object.<br /><br /><br />When storing into the database, only the values for the three costs and the final selling price would get written to the table, reflecting the core reality of this business transaction.  All the rest is summary and analysis.]]></description>
<content:encoded><![CDATA[ Viewing a question from the proper perspective often makes the answer clear.<br /><br /><br />I'm building an application that needs to be able to specify the profit calculations in two different ways.  Once the subtotal of all of the costs is generated, users need to be able to either input a percentage markup over the subtotal to generate the selling price (PRICE = COST * (1 + MARKUP)), or input a percentage margin of profit that the final selling price will have (PRICE = COST / (1 - MARGIN)).  Users need the flexibility to revisit the calculations at any time and enter either of these values and re-calculate.<br /><br /><br />The first dilemma surfaces regarding which values to store in the database.  Obviously, the individual components of the subtotal are stored, but not the subtotal.  However, in regards to the final selling price (the 'grand total', if you will) there are three interdependent values (the final selling price, the margin and the markup), any one of which could be stored and the remaining two could be calculated.<br /><br /><br />The standard guidelines of database design instruct us to avoid storing calculated values, so it may seem natural to store the markup percentage, and calculate the grand total (and the margin percentage) when necessary; at least, that was my first thought.  But, then what about storing the margin and calculating the other two?  There seems to be no reason to favor margin over markup or vice-versa.<br /><br /><br />Upon further reflection, I would argue against either of these choices.  One issue arises from rounding concerns.  The final selling price often needs to be rounded off to the nearest penny (or dollar, even), which could slightly affect the actual markup percentage.  Another issue concerns the philosophical concepts of right and wrong.  Can we say that there are right and wrong choices in this matter?  I think we can.<br /><br /><br />My point of view regarding database design has always been that we want to capture reality in the structure of our databases.  I always push the customer towards the question "What is the real relationship between these objects in the context of your business?" to determine the required structure of their database.  The point at which details become irrelevant to their business will determine the limits of the portion of reality that we wish to mirror in their database.<br /><br /><br />When applying that point of view to our question of totals and margins, it would seem to me that the most "real" quantity of the three is the selling price.  In the actual transaction between the seller and buyer, the item costs "X" to produce, and it sells for "Y".  Margins and markups are really summary values derived from those basic facts.  Therefore, I've decided to store the final selling price in the database and to calculate the margins and markups when needed.<br /><br /><br />The second dilemma arises when actually trying to implement this approach in the user interface.  The users are going to be entering one of the calculated values, and the application will then back-calculate the "raw" data.  So how does this all work out in an object-oriented application?<br /><br /><br />Let's start with a Widget object that has costs for raw materials, production and finishing.  Each of those costs would be stored in the Widget table of the database along with its selling price.  The Widget object would have three read/write properties for raw materials, production and finishing costs along with a read-only property for a subtotal of those costs.  The selling price would be a read/write property, but it would only be written when loading up from the database; during the rest of the life of the Widget object, the selling price would be manipulated by assigning the Widget a new margin or markup.  (Depending on the development language, this might be enforced by the design of the Widget object and the scope of its member variables and properties; otherwise, it must rely on the restraint of the programmer to avoid directly assigning a selling price in the UI layer.) <br /><br /><br />The markup and margin percentages would be read/write properties, but there would be no local variable inside of the Widget object tracking these values.  Instead, they would be directly calculated (from the subtotal and selling price) whenever they need to be retrieved; whenever either of them is assigned, the Widget object would merely re-calculate the selling price and store that locally.<br /><br /><br />In an event-driven, conventional application, the user's changes to the onscreen values of the margin and markup would be immediately written to the in-memory Widget object, and the selling price would then be immediately updated onscreen.<br /><br /><br />In an asynchronous (e.g. web) application, it would involve just a bit of extra work in the user-interface layer.  Suppose a Widget object is loaded in memory and displayed onscreen.  There would be a place where the user could enter a new margin and a place where the user could enter a new markup.  When saving the user's changes, we wouldn't want to blindly write the onscreen values for both properties back into our Widget object.  If the user has changed the first one, but not the second, and we write both values back onto the Widget object, the user's changes (the first value) would get overwritten by the original values (the second one) with the net effect of no change at all to the selling price.  Only the changed value(s) would get written back onto the Widget object.<br /><br /><br />When storing into the database, only the values for the three costs and the final selling price would get written to the table, reflecting the core reality of this business transaction.  All the rest is summary and analysis.  ]]> 
  </content:encoded>
				<category>Database</category>
				<pubDate>Fri, 18 Nov 2005 23:26:03 GMT</pubDate>
				<guid>http://www.logicalimagination.com/drewsoniandisplay_blog.cfm?bid=C4990E05-BCD4-2240-D28DC9E99F16A2AD</guid>
			</item>
</channel></rss>