AEC companies trying to operationalize AI often find they lack the data foundation on which to build. There may be an abundance of data hidden in documents, but you can’t reliably use it for AI.
Poor data quality was a key topic at the AI in AEC 2026 conference. During the event, I met many experts working to solve this problem, including Pavlina Nikolova, Egnyte’s EMEA AEC Practice Lead. Our conversation and her presentation highlighted the challenges and ways to overcome them.
Most AEC firms understand that AI is only as good as the data it works on. What they underestimate is how much work lies between their current state and a data environment in which AI can deliver meaningful results.
A presenter from a leading UK contractor described how it took two years to prepare quality incident data for AI processing. The dataset covered over a million records, more than 20 business units, and more than 12 years of project history. That is what AI readiness looks like in practice: two years of preparation before a single useful query is run.
The costly memory problem
As an architecture practitioner and R&D manager, I worked for companies that were painfully aware of the memory problem. Every project started afresh, and only a fraction of the previously accumulated assets and knowledge was available in a useful form. There were attempts to remedy that, but they were crowded out by project emergencies.
The advent of generative AI has made the need for usable information ever more urgent. Pavlina explained how customer data needs have increased eightfold over a five-year period. Larger files, cloud collaboration, and compliance requirements all contributed.
Project knowledge in AEC lives across multiple systems simultaneously. CRM platforms hold client relationships and proposal history. Estimating tools contain cost data and takeoff logic. Project management systems track drawings and submittals. Shared drives hold work-in-progress files. Email carries a significant portion of day-to-day decisions and approvals. Each system contains valuable information, but none is organized in a way that allows the information to be reused.
Research cited at the conference found that 5% of project costs go to rework, and around 80% of that rework stems from misplaced information or poor communication. On a 100-million-euro project, that is five million euros of rework, of which roughly four million is lost not to materials or labor but to organizational failure.
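A quick check of that arithmetic, using only the percentages quoted above. The variable names are illustrative, and integer percentages are used to keep the result exact.

```python
# Illustrative arithmetic only; the figures are the ones cited at the conference.
project_cost_eur = 100_000_000
rework_percent = 5        # share of project cost that goes to rework
information_percent = 80  # share of rework traced to misplaced information or poor communication

# Integer arithmetic avoids floating-point rounding in the totals.
rework_cost_eur = project_cost_eur * rework_percent // 100
information_rework_eur = rework_cost_eur * information_percent // 100

print(f"Rework: {rework_cost_eur:,} EUR")
print(f"Information-related rework: {information_rework_eur:,} EUR")
```

On these numbers, about four of the five million euros of rework trace back to how information is handled.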
Three principles for turning data into knowledge
Data consolidation and data organization are the two responses Pavlina sees most frequently among firms trying to address this. Both are necessary, and neither is a quick fix.
The foundation is metadata: data about the contents and context of a document. In most AEC firms today, finding a file depends on knowing its name and naming logic, or remembering which folder it lives in. With proper metadata in place, teams can search by project characteristics, document type, date range, discipline, or any other relevant attribute. What matters is not what a file is called, but what it contains and what it relates to.
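To make the idea concrete, here is a minimal sketch of attribute-based search in Python. The record fields, project names, and the search function are all hypothetical and do not represent any particular platform's schema or API. Note that the file names are deliberately unhelpful, yet the search still works.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DocumentRecord:
    """Hypothetical metadata record; the field names are assumptions."""
    path: str
    project: str
    doc_type: str
    discipline: str
    issued: date


def search(records, *, issued_from=None, issued_to=None, **attributes):
    """Return records matching every given attribute and the optional date range."""
    results = []
    for record in records:
        # Every requested attribute must match exactly.
        if any(getattr(record, name) != value for name, value in attributes.items()):
            continue
        if issued_from is not None and record.issued < issued_from:
            continue
        if issued_to is not None and record.issued > issued_to:
            continue
        results.append(record)
    return results


records = [
    DocumentRecord("shared/final_v3_NEW.pdf", "Harbour Tower", "drawing", "structural", date(2024, 3, 14)),
    DocumentRecord("inbox/scan0042.pdf", "Harbour Tower", "contract", "commercial", date(2023, 11, 2)),
    DocumentRecord("misc/A-101.dwg", "Riverside School", "drawing", "architectural", date(2022, 6, 30)),
]

# Find drawings for one project issued since the start of 2024, without knowing any file name.
matches = search(records, project="Harbour Tower", doc_type="drawing", issued_from=date(2024, 1, 1))
print([record.path for record in matches])
```

The query never touches the path, which is the point: the name "final_v3_NEW.pdf" tells you nothing, but its metadata does.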
Classification builds on metadata. An AI-powered platform can recognize whether an uploaded file is a drawing, a contract, an invoice, or a submittal without requiring manual tagging. This matters because the volume of documents on a large construction project makes manual classification impractical.
The third principle is permissioning. Construction projects involve a large and shifting cast of stakeholders: internal teams, subcontractors, consultants, clients, and regulators. Who has access to what, at which stage of the project, is not a minor administrative question. It determines whether sensitive financial data, contractual terms, or proprietary design information stays where it belongs.
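A minimal sketch of what stage-aware permissioning can look like, assuming a simple deny-by-default rule table. The roles, document types, stage names, and the can_access helper are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessRule:
    """Hypothetical rule: a role may read a document type during certain stages."""
    role: str
    doc_type: str
    stages: frozenset


ALL_STAGES = frozenset({"design", "construction", "closeout"})

RULES = [
    AccessRule("internal_team", "drawing", ALL_STAGES),
    AccessRule("internal_team", "cost_report", ALL_STAGES),
    AccessRule("subcontractor", "drawing", frozenset({"construction"})),
    AccessRule("client", "drawing", ALL_STAGES),
]


def can_access(role, doc_type, stage, rules=RULES):
    """Deny by default; allow only when an explicit rule covers role, type, and stage."""
    return any(
        rule.role == role and rule.doc_type == doc_type and stage in rule.stages
        for rule in rules
    )


print(can_access("subcontractor", "drawing", "construction"))      # allowed while on site
print(can_access("subcontractor", "drawing", "closeout"))          # access ends with the stage
print(can_access("subcontractor", "cost_report", "construction"))  # financial data stays internal
```

Deny-by-default is the design choice that matters here: a stakeholder who joins mid-project sees nothing until a rule explicitly grants access.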
Bring AI to your data, not your data to AI
One of the cleaner formulations I heard at the conference came from Pavlina: bring AI to your data, not your data to AI. Many firms instinctively subscribe to whichever AI tool best handles a given task: one for contract analysis, another for estimation support, a third for site photo interpretation. The result is that data is extracted and sent outward at every step, which recreates the fragmentation problem in a new form and introduces data security risks that most firms have not fully considered.
The Building Safety Act in the UK now requires certain building types to retain project data for 30 years, with reliable access to it. That is one example of a compliance requirement that makes scattering data across multiple disconnected AI platforms a serious liability. Other jurisdictions are moving in similar directions.
The practical implication is that firms should evaluate AI tools not just on their analytical capabilities, but on where and how they process data. A platform that brings AI capabilities to a governed, centralized data environment is architecturally different from a collection of point solutions that each require data to leave the environment.
What a practical solution looks like
Most of the problems described above are solvable with the right data infrastructure, and several platforms are moving in that direction. Egnyte’s offering for the AEC sector is one worth examining. It combines file collaboration, data governance, and security controls in a single platform, with AI capabilities built in rather than bolted on.
The Project Hub product within the platform directly addresses the lifecycle management problem. It ensures project knowledge is retrievable rather than archived by standardizing project setup, organizing documents from kickoff through closeout, and automatically applying consistent metadata and classification. The ROT management capability identifies redundant, outdated, and trivial data and addresses the quality problem before AI is applied.
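To illustrate the general idea of ROT detection, here is a minimal sketch in Python. It is not Egnyte's implementation: the FileInfo record, the flag_rot function, and the age and size thresholds are all assumptions, and a real system would rely on far richer signals than a content hash, a date, and a byte count.

```python
import hashlib
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class FileInfo:
    """Hypothetical in-memory stand-in for a stored file."""
    path: str
    content: bytes
    last_modified: date


def flag_rot(files, *, today, max_age_days=365 * 7, min_size_bytes=16):
    """Map each flagged path to its reasons. Thresholds are illustrative assumptions."""
    flags = {}
    first_seen = {}  # content hash -> path of the first file with that content
    for f in files:
        reasons = []
        digest = hashlib.sha256(f.content).hexdigest()
        if digest in first_seen:
            reasons.append(f"redundant (same content as {first_seen[digest]})")
        else:
            first_seen[digest] = f.path
        if (today - f.last_modified).days > max_age_days:
            reasons.append("outdated")
        if len(f.content) < min_size_bytes:
            reasons.append("trivial")
        if reasons:
            flags[f.path] = reasons
    return flags


files = [
    FileInfo("specs/spec_rev2.txt", b"Concrete grade C30/37 for all slabs.", date(2024, 5, 1)),
    FileInfo("specs/copy of spec_rev2.txt", b"Concrete grade C30/37 for all slabs.", date(2024, 5, 3)),
    FileInfo("archive/site_rules.txt", b"Legacy site induction rules, superseded.", date(2009, 2, 10)),
    FileInfo("tmp/notes.txt", b"todo", date(2024, 6, 1)),
]

flags = flag_rot(files, today=date(2025, 1, 1))
for path, reasons in flags.items():
    print(path, "->", ", ".join(reasons))
```

Flagging is kept separate from deleting on purpose. Given retention requirements, an "outdated" file may still have to be kept, so a flag is better treated as a prompt for review than as an instruction to remove anything.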
AEC firms that are serious about AI readiness will need to make decisions about their data infrastructure that go beyond selecting an AI tool. Egnyte’s AEC content cloud is one platform worth examining as part of that process. You can find it at egnyte.com/industries/aec.
This article was produced in partnership with Egnyte.






