The source referenced by the OP has some credibility, but what about Microsoft - what is its stance on struct usage? I sought some extra learning from Microsoft, and here is what I found:
Consider defining a structure instead of a class if instances of the
type are small and commonly short-lived or are commonly embedded in
other objects.
Do not define a structure unless the type has all of the following characteristics:
- It logically represents a single value, similar to primitive types (integer, double, and so on).
- It has an instance size smaller than 16 bytes.
- It is immutable.
- It will not have to be boxed frequently.
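For contrast, a small value type that satisfies all four points might look like this (the type is purely illustrative, not from any Microsoft source):

// One logical value, 8 bytes, immutable, and unlikely to be boxed often.
public struct Celsius
{
    public readonly double Degrees;

    public Celsius(double degrees)
    {
        Degrees = degrees;
    }
}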
Microsoft consistently violates those rules
Okay, #2 and #3 anyway. Our beloved Dictionary has two nested structs:
[StructLayout(LayoutKind.Sequential)] // default for structs
private struct Entry //<TKey, TValue>
{
// View code at *Reference Source
}
[Serializable, StructLayout(LayoutKind.Sequential)]
public struct Enumerator :
IEnumerator<KeyValuePair<TKey, TValue>>, IDisposable,
IDictionaryEnumerator, IEnumerator
{
// View code at *Reference Source
}
*Reference Source
The 'JonnyCantCode.com' source got 3 out of 4 - quite forgivable since #4 probably wouldn't be an issue. If you find yourself boxing a struct, rethink your architecture.
Let's look at why Microsoft would use these structs:
- Each struct, Entry and Enumerator, represents a single value.
- Speed - Entry is never passed as a parameter outside of the Dictionary class. Further investigation shows that, in order to satisfy the implementation of IEnumerable, Dictionary uses the Enumerator struct, which it copies every time an enumerator is requested ...makes sense.
- Internal to the Dictionary class. Enumerator is public because Dictionary is enumerable and must have equal accessibility to the IEnumerator interface implementation - e.g. the IEnumerator getter.
Update - In addition, realize that when a struct implements an interface - as Enumerator does - and is cast to that implemented type, the struct is boxed: a copy is moved to the heap and handled as a reference type. Internal to the Dictionary class, Enumerator is still a value type. However, as soon as a method calls GetEnumerator() through the IEnumerable interface, a reference-type IEnumerator is returned.
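A minimal illustration of that difference (not taken from the Dictionary source):

using System.Collections.Generic;

class BoxingDemo
{
    static void Main()
    {
        var dict = new Dictionary<int, int> { { 1, 1 } };

        // The public GetEnumerator() returns the struct itself - a copy, no boxing.
        Dictionary<int, int>.Enumerator byValue = dict.GetEnumerator();

        // Assigning to the interface type boxes the struct onto the heap;
        // the caller now holds a reference type.
        IEnumerator<KeyValuePair<int, int>> boxed = dict.GetEnumerator();
    }
}

This is also why a plain foreach over the Dictionary itself uses the struct enumerator directly, while iterating it through an IEnumerable<T> reference goes through the boxed one.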
What we don't see here is any attempt, or any proof of a requirement, to keep structs immutable or to maintain an instance size of only 16 bytes or less:

1. Nothing in the structs above is declared readonly - not immutable
2. The size of these structs could be well over 16 bytes
3. Entry has an undetermined lifetime (from Add(), to Remove(), Clear(), or garbage collection)

And ...

4. Both structs store TKey and TValue, which we all know are quite capable of being reference types (added bonus info)
Hashed keys notwithstanding, dictionaries are fast in part because instantiating a struct is quicker than instantiating a reference type. Here, I have a Dictionary<int, int> that stores 300,000 random integers with sequentially incremented keys.
Capacity: 312874
MemSize: 2660827 bytes
Completed Resize: 5ms
Total time to fill: 889ms
- Capacity: number of elements available before the internal array must be resized.
- MemSize: determined by serializing the dictionary into a MemoryStream and getting a byte length (accurate enough for our purposes).
- Completed Resize: the time it takes to resize the internal array from 150862 elements to 312874 elements. When you figure that each element is sequentially copied via Array.CopyTo(), that ain't too shabby.
- Total time to fill: admittedly skewed due to logging and an OnResize event I added to the source; however, still impressive to fill 300k integers while resizing 15 times during the operation. Just out of curiosity, what would the total time to fill be if I already knew the capacity? 13ms
So, now, what if Entry were a class? Would these times or metrics really differ that much?
Capacity: 312874
MemSize: 2660827 bytes
Completed Resize: 26ms
Total time to fill: 964ms
Obviously, the big difference is in resizing. Any difference if Dictionary is initialized with the Capacity? Not enough to be concerned with ... 12ms.
What happens is, because Entry is a struct, it does not require initialization like a reference type. This is both the beauty and the bane of the value type. In order to use Entry as a reference type, I had to insert the following code:
/*
 * Added to satisfy initialization of entry elements --
 * this is where the extra time is spent resizing the Entry array
 */
for (int i = 0; i < prime; i++)
{
    destinationArray[i] = new Entry();
}
/* *********************************************** */
The reason I had to initialize each array element when Entry is used as a reference type can be found at MSDN: Structure Design. In short:
Do not provide a default constructor for a structure.
If a structure defines a default constructor, when arrays of the
structure are created, the common language runtime automatically
executes the default constructor on each array element.
Some compilers, such as the C# compiler, do not allow structures to
have default constructors.
Deciding how to use structs responsibly is actually quite simple, and we will borrow from Asimov's Three Laws of Robotics:
- The struct must be safe to use
- The struct must perform its function efficiently, unless this would violate rule #1
- The struct must remain intact during its use unless its destruction is required to satisfy rule #1
...what do we take away from this? In short: be responsible with the use of value types. They are quick and efficient, but can cause many unexpected behaviors if not properly maintained (e.g. unintentional copies).
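As a small, contrived example of the kind of unintentional copy that causes those surprises (not from the Dictionary source):

using System;
using System.Collections.Generic;

struct MutablePoint
{
    public int X;
    public void MoveRight() { X++; }
}

class CopyPitfall
{
    static void Main()
    {
        var points = new List<MutablePoint> { new MutablePoint() };

        // The indexer returns a copy of the struct; mutating the copy
        // leaves the element stored in the list untouched.
        MutablePoint p = points[0];
        p.MoveRight();

        Console.WriteLine(points[0].X);   // prints 0, not 1
    }
}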
Best Solution
You have restricted the purpose and applications of XSDs by making XSDs specific to Datasets in your question.
As an example, let's say you are heavily using XML files in your application, which you exchange with different types of remote sources. These sources may send you XML files in various formats. In your application you need to be sure you receive the XML file in the proper format so that you can perform further business operations on it. So you need to enforce standardization on those XML files: you must validate each XML file against an acceptable standard at your end, comparing the schema of the XML with that standard. These standards are written in XSD form, and you validate the schema of your XML file against the schema standard defined in the XSD file. This is the actual purpose of XSD files.
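A sketch of that validation step in C# - the schema namespace and file names here are placeholders, not part of the original question:

using System;
using System.Xml;
using System.Xml.Schema;

class XsdValidationDemo
{
    static void Main()
    {
        var settings = new XmlReaderSettings();
        settings.ValidationType = ValidationType.Schema;

        // "Orders.xsd" and the target namespace are stand-ins for your own standard.
        settings.Schemas.Add("http://example.com/orders", "Orders.xsd");
        settings.ValidationEventHandler += (sender, e) =>
            Console.WriteLine(e.Severity + ": " + e.Message);

        // Reading the document drives validation; any schema violation
        // is reported through the handler above.
        using (XmlReader reader = XmlReader.Create("incoming-order.xml", settings))
        {
            while (reader.Read()) { }
        }
    }
}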
Now, answering your questions:
1.) Do you think the XSD information should be located as part of the Model?
As I just said, an XSD file stores the schema, not the data. In the same way, any application that uses Datasets - which actually hold data in memory at runtime - will also have its own schema, the form in which the data is held. This varies based on the underlying DataTables and their relations. So the MS guys introduced the concept of TypedDataSets. TypedDataSets, as the name suggests, are a qualified schema of the Dataset which you are going to use at run-time to play with the data. TypedDataSets are actually defined in the form of an XSD file, which describes the schema of the DataTables and the relations in between.

So when you create a TypedDataSet file in Visual Studio, it basically creates an XSD file: all tables that you add from the database source to the TypedDataSet surface are analyzed, and the metadata schema of each table is written into the XSD file. At runtime, when you select records into your dataset, you already know what kind of data is coming into them, and if the data is not in the form defined in the XSD you will get a runtime exception.
Still, XSDs are not instrumental at runtime, because Visual Studio generates the typed-dataset codebase from the XSD file using the XSD.exe tool.
2) Does it mean that Data Access Layer returns Datasets and other generated objects?
If your data layer is using TypedDatasets, it will return DataTables, DataRow[], or DataRow as you need.
3) Does it go through all the system layers all the way to the UI?
You can generate custom business objects on top of it, which is a recommended practice, rather than throwing Dataset objects here and there in your application.
4) If the XSD is part of the Data Access Layer, should I convert the results to objects from the Model? What is the best conversion methodology?
Write a mapping mechanism using reflection. We map our DataRows to business object instances and DataTables to business object collections.
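A rough sketch of what such a reflection-based mapper could look like; the ColumnMapping attribute and its shape are hypothetical stand-ins, not the actual framework code:

using System;
using System.Data;
using System.Reflection;

// Hypothetical attribute marking which column a property maps to.
[AttributeUsage(AttributeTargets.Property)]
class ColumnMappingAttribute : Attribute
{
    public string ColumnName { get; private set; }
    public ColumnMappingAttribute(string columnName) { ColumnName = columnName; }
}

static class RowMapper
{
    // Builds a business object from a DataRow by matching
    // [ColumnMapping] attributes to column names via reflection.
    public static T Map<T>(DataRow row) where T : new()
    {
        var item = new T();
        foreach (PropertyInfo prop in typeof(T).GetProperties())
        {
            var attr = (ColumnMappingAttribute)Attribute.GetCustomAttribute(prop, typeof(ColumnMappingAttribute));
            if (attr != null && row.Table.Columns.Contains(attr.ColumnName) && row[attr.ColumnName] != DBNull.Value)
            {
                prop.SetValue(item, row[attr.ColumnName], null);
            }
        }
        return item;
    }
}

Mapping a DataTable to a collection is then just a loop over its rows calling Map<T> for each one.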
You can start re-designing to scale up your project with a more maintainable architecture. Of course this will take time and effort, but eventually you'll have great results.
This is what I have in my project.
1.) Application.Infrastructure
2.) Application.DataModel
3.) Application.DataAccess
4.) Application.DomainObjects
5.) Application.BusinessLayer
6.) Application.WebClient or Application.WindowsClient
Application.BusinessObjects are used across the application and they travel across all layers whenever needed [except Application.DataModel and Application.Infrastructure]
All my queries are defined only in Application.DataModel.
Application.DataAccess returns or takes business objects as part of any data-access operation. Business objects are created with the help of reflection attributes. Each business object is marked with an attribute mapping it to the target table in the database, and properties within the business object are marked with attributes mapping them to the target column in the respective database table.
My validation framework lets me validate each field with the help of designated ValidationAttribute.
My framework heavily uses attributes to automate most of the tedious tasks like mapping and validation. I can also add a new feature as a new aspect in the framework.
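A sketch of how such attribute-driven validation could work; RequiredFieldAttribute here is a hypothetical stand-in for the framework's actual ValidationAttribute types:

using System;
using System.Reflection;

// Hypothetical validation attribute for mandatory fields.
[AttributeUsage(AttributeTargets.Property)]
class RequiredFieldAttribute : Attribute { }

static class Validator
{
    // Returns false if any [RequiredField] property is null or empty.
    public static bool Validate(object businessObject)
    {
        foreach (PropertyInfo prop in businessObject.GetType().GetProperties())
        {
            if (Attribute.IsDefined(prop, typeof(RequiredFieldAttribute)))
            {
                object value = prop.GetValue(businessObject, null);
                if (value == null || value.ToString().Length == 0)
                    return false;
            }
        }
        return true;
    }
}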
A sample business object would look like this in my application.
User.cs
BookCollection.cs
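The original User.cs and BookCollection.cs listings are not reproduced here; as a rough sketch of what an attribute-mapped business object of this kind might look like (the attribute names are hypothetical, following the mapping sketch above):

using System;

// Hypothetical class-level attribute naming the target table.
[AttributeUsage(AttributeTargets.Class)]
class TableMappingAttribute : Attribute
{
    public string TableName { get; private set; }
    public TableMappingAttribute(string tableName) { TableName = tableName; }
}

// Hypothetical property-level attribute naming the target column.
[AttributeUsage(AttributeTargets.Property)]
class ColumnMappingAttribute : Attribute
{
    public string ColumnName { get; private set; }
    public ColumnMappingAttribute(string columnName) { ColumnName = columnName; }
}

// Rough shape of a business object such as User.cs.
[TableMapping("Users")]
public class User
{
    [ColumnMapping("UserId")]
    public int UserId { get; set; }

    [ColumnMapping("UserName")]
    public string UserName { get; set; }

    [ColumnMapping("Email")]
    public string Email { get; set; }
}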